extensional (or executable) aspects of procedural skills. These diagnostic
models provide not only a technique for modelling the underlying or deep
structure aspects of a procedural skill but they also suggest that an
important criterion for modelling cognitive processes and their related
knowledge representation is that of finding a natural way to account for
all possible manifested errors in human performance of that skill. The
second chapter describes a considerably more complex theory/technique
for analyzing the problem solving trace or protocol of a student and
then automatically synthesizing a model of his problem solving strategies
as well as the motivations or "plans" that he used to guide him in his
solution. This theory captures the subtle reasoning powers of a master
tutor and as such acts as a powerful modelling technique of a learner
which is needed for guiding our computer-based laboratory tutor as well as
providing a new methodology for measuring how a student's problem solving
performance is evolving. This theory also forms a cornerstone for
building information processing models of master tutors.

The instructional paradigm being developed is quite different from the
classical CAI or CMI approaches. Here, we are focusing on techniques
for teaching procedural knowledge and reasoning strategies which are best
learned through hands-on laboratory or problem-solving tasks in which the
student gets a chance to exercise his knowledge under the watchful and
critical eye of an automated intelligent tutor. Our instructional systems
attempt to mimic the capabilities of a laboratory instructor who works
on a one-to-one basis with a trainee and who can carefully diagnose what
the trainee knows, how he reasons, what kinds of deficiencies exist in
his ability to apply his factual knowledge and so on.

... reports [..., 6, 7] which document our recent investigations into a
theory for automatically inducing and using (structural) models of a
student which explicate his reasoning strategies, his representation of
procedural skills and his underlying misconceptions, as manifested in his
errors. Our basic methodology has been to explore segments of the
modelling problem in the context of particular knowledge domains, and to
implement tentative theories in the form of prototype intelligent
instructional systems. This methodology not only provides us a test for
the completeness and usefulness of our theories, but equally important it
provides us an opportunity to develop and experiment with tutorial
strategies which utilize the kind of deep structure model of a learner
which was, heretofore, impossible to draw upon.



Before proceeding, we should comment on why structural student models
(as opposed to simpler parametric models) are critical to the kind of
instructional paradigm being developed under this Tri-service contract.
One of the classical goals of CAI has been to produce adaptive
instructional systems which transform textbook and classroom type learning
into self-paced individualized instruction. Learner models for directing
this kind of instruction require very little detail with respect to the
reasoning capabilities and underlying knowledge representations of the
particular learner. For example, parametric models based on a factor
analysis of a student's performance, or Markov models based on observed
transition probabilities, often capture all the information that is needed.
Note, however, that the parameters of such models don't reveal very much
about the infinite variety, subtlety and structure of the reasoning
strategies and problem solving heuristics of the student; nor do they, in
themselves, reflect any of his deep-seated misconceptions. In part this
fundamental limitation arises from the fact that there are only a finite
(and usually small) number of parameters which can represent only a finite
number of predetermined "entities". In other words, these models are
basically extensional with no generative capabilities.



The instructional paradigm being developed here is quite different
from the classical CAI or CMI approaches. In particular, we are not
focusing on techniques for teaching factual, textbook knowledge (which can
often be competently handled by the frame-oriented CAI or CMI systems).
Instead, we are focusing on techniques for teaching procedural knowledge
and reasoning strategies which are best learned through hands-on laboratory
or problem-solving tasks during which the student gets a chance to exercise
his knowledge under the watchful and critical eye of an automated
intelligent tutor. Our instructional systems attempt to mimic the
capabilities of a laboratory instructor or "coach" who works on a
one-to-one basis with a trainee and who can carefully diagnose what the
trainee knows, how he reasons, what kinds of deficiencies exist in his
ability to apply his factual knowledge and so on. The instructor then uses
this inferred knowledge of the trainee to determine how best to critique
and/or kibitz with him.

This report describes some techniques and a beginning theory for how a
computer-based "intelligent" laboratory instructor (or on-the-job-site
trainer) can extract and use such information about the learner. The first
chapter discusses the concept of a diagnostic model, which is based on the
concept of a "procedural network" - a network having many of the properties
of the older style semantic networks but which captures both the
intensional and extensional (or executable) aspects of procedural skills.
These diagnostic models provide not only a technique for modelling the
underlying or deep structure aspects of a procedural skill but they also
suggest that an important forcing function for modelling cognitive
processes and their related knowledge representation is that of finding a
natural way to account for all possible manifested errors in human
performance of that skill.

The second chapter describes a considerably more complex
theory/technique for examining the problem solving trace or protocol of a
student and automatically synthesizing, from the trace, a model of his
problem solving strategies as well as the motivations or "plans" that he
used to guide him in his solution. This theory begins to capture the
subtle reasoning powers of a master tutor and as such not only acts as 1) a
powerful learner modelling technique (useful for guiding our computer-based
lab instructors as well as providing a methodology for measuring how a
student's problem solving performance is evolving as a result of some
instruction) but also as 2) a cornerstone for building information
processing models of the skills of a master tutor.

CHAPTER 1

DIAGNOSTIC MODELS FOR PROCEDURAL SKILLS (1)

"If you can both listen to students and accept their answers not as
things to just be judged right or wrong but as pieces of information
which may reveal what the student is thinking you will have taken a
giant step toward becoming a master teacher rather than merely a
disseminator of information." -- J.A. Easley, Jr. & Russell E. Zwoyer

Until recently our efforts in constructing "intelligent"
knowledge-based instructional systems (ICAI) have been primarily focussed
on endowing computers with sufficient expertise to answer a student's
questions, critique his behavior, and in some cases, help him debug his own
understanding. Although such expertise is necessary for sophisticated
training systems, it is by no means the whole story. Master tutors have
skills that transcend their particular field of expertise. One of their
greatest talents is the artful synthesis of an accurate "picture" of a
student's misconceptions from the meager manifestations reflected in his
errors. An accurate picture of a student's capabilities is a prerequisite
to any attempt at direct individual remediation. The pictures of students
that teachers develop (in whatever form) are often called "models". The
form, use and induction of such models for procedural skills is the topic
of this chapter. In particular, we shall describe some initial efforts in
the development and use of a representational technique called "procedural
networks" as the framework for constructing diagnostic models of procedural
skills. A diagnostic model attempts to capture a student's common
misconceptions or faulty behavior as simple changes to (or mistakes in) a
correct model.



This chapter consists of four sections. The first describes a domain
of application and provides examples of the problems which must be faced
with a diagnostic model. The second introduces procedural networks as a
general framework for representing procedural knowledge underlying a skill

(1) A version of this chapter has been accepted for publication in the
Proceedings of the National Association of Computing Machinery, 197?.



Sample of the student's work:

      ...     328     989      66     216
      ...    +917     ...    +887     +13

(Entries shown as "..." and the student's answers are illegible in the
source.)

Once you have discovered the bug, try testing your hypothesis by
"simulating" that bug and predicting the results on the following two test
problems.

      446     201
     +815    +399



The bug is really quite simple. In computer terms, the student, after
determining the carry, forgets to reset the "carry register" and hence the
amount carried is accumulated across the columns. For example, in the
second problem 8+7=15, so he writes 5 and carries 1; 2+1=3 plus the one
carry is 4. Lastly 3+9=12 but that one carry from the first column is
still there (it hasn't been reset), so adding it in to this column gives
13. If this is the bug, then the answers to the test problems will be
1361 and 700. This "bug" is not so absurd when one considers that a child
might use his fingers to remember the carry and forget to bend back his
fingers, or counters, after each carry is added.
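
The bug described above can be checked mechanically. The following is a minimal sketch in Python (not the authors' LISP implementation); the function name is hypothetical, and the handling of the leftmost column (writing the whole column sum) is an assumption chosen to agree with the worked example.

```python
def add_with_unreset_carry(a: int, b: int) -> int:
    """Column addition in which the "carry register" is never reset.

    Hypothetical illustration of the bug discussed in the text.
    """
    # Digits, least significant first, padded to a common width.
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]
    width = max(len(xs), len(ys))
    xs += [0] * (width - len(xs))
    ys += [0] * (width - len(ys))

    carry = 0       # a correct procedure would reset this after every column
    written = []    # answer digits, least significant first
    for col in range(width):
        total = xs[col] + ys[col] + carry
        if col == width - 1:
            written.append(total)          # leftmost column: write the whole sum
        else:
            written.append(total % 10)
            if total >= 10:
                carry += 1                 # bug: accumulate instead of resetting
    return int("".join(str(d) for d in reversed(written)))


print(add_with_unreset_carry(328, 917))   # second sample problem
print(add_with_unreset_carry(446, 815))   # first test problem
print(add_with_unreset_carry(201, 399))   # second test problem
```

Run on the two test problems, the sketch reproduces the predicted answers 1361 and 700.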

A common assumption among teachers is that students do not follow
procedures well and that erratic behavior is the primary cause of a
student's inability to perform each individual step correctly. Our
experience has been that students are remarkably able procedure followers,
but that they often follow the wrong procedures. One case encountered
last year is of special interest. The student proceeded through a good
portion of the school year with his teacher thinking that he was
exhibiting random behavior in his performance of arithmetic. As far as
the teacher was concerned there was no systematic explanation for his
errors, and we must admit that before we had "discovered" his bug we, too,
thought that he was erratic. Here is a sample of his work:




[Sample of the student's work: a row of simple additions (including 17+8),
followed by multi-digit additions with top operands 87, 365, 679, 923,
27,493 and 797. The remaining entries are illegible in the source.]



There is a clue to the nature of his bug in the number of ones in his
answers. Every time the addition of a column involves a carry, a one
mysteriously appears in that column; he is simply writing down the carry
digit and forgetting about the units digit! One might be misled by 17+8
which normally involves a carry yet is added correctly. It would seem
that he is able to do simple additions by a completely different procedure,
possibly by counting up from the larger number on his fingers.

The manifestation of this student's simple bug carries over to other
types of problems which involve addition as a subskill. What answer would
he give for the following?

A family has traveled 2975 miles on a tour of the U.S. They have 1825
miles to go. How many miles will they have traveled at the end of their
tour?

He correctly solved the word problem to obtain the addition problem 2975 +
1825 to which he answered 3191. Since his work was done on a scratch
sheet, the teacher only saw the answer which is, of course, wrong. As a
result, the teacher assumed that he had trouble with word problems as well
as arithmetic.
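
The answer 3191 follows directly from the bug. Below is a minimal Python sketch under one assumption inferred from that answer: when a column sum has two digits the student writes the carry digit in that column and nothing is carried into the next column (the tens digit 9 = 7+2 shows no carry arrived). The function name is hypothetical.

```python
def add_writing_carry_digit(a: int, b: int) -> int:
    """Column addition that writes the carry digit and drops the units digit.

    Hypothetical illustration; assumes no carry is propagated leftward.
    """
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]
    width = max(len(xs), len(ys))
    xs += [0] * (width - len(xs))
    ys += [0] * (width - len(ys))

    written = []
    for x, y in zip(xs, ys):
        total = x + y
        # Bug: a two-digit column sum is recorded as its tens (carry) digit.
        written.append(total // 10 if total >= 10 else total)
    return int("".join(str(d) for d in reversed(written)))


print(add_writing_carry_digit(2975, 1825))
```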

When we studied this same student's work in other arithmetic
procedures, we discovered a recurrence of the same bug. Here is a sample
of his work in multiplication:

      68     734     543     758    2764
     x46     x37    x206    x296     x53
     ---     ---    ----    ----    ----
      24     792     141     144    2731

There are really several bugs manifested here, the most severe one being
that his multiplication algorithm mimics his addition algorithm. But
notice that the bug in his addition algorithm above is also present in his
multiplication procedure. The "carry unit" subprocedure bug shows up in
both his multiplication and addition. For example, to do 68x46, in the
first column he performs 8x6, gets 48 and then writes down the "carry"
which in this case is 4, ignoring the units digit. Then he multiplies 6x4
to get 2 for the second column. All along he has a complete and consistent
procedure for doing arithmetic. His answers throughout all of his
arithmetic work are far from random. In fact they display near perfection
with respect to his way of doing it.
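
The recurrence of one bug in two skills can be sketched by letting addition and the student's column-by-column "multiplication" share a single column-writing subprocedure. This is an illustrative Python sketch, not the authors' representation; all names are hypothetical, and bringing down a lone digit when one operand is shorter is an assumption.

```python
from itertools import zip_longest
from operator import add, mul


def write_column_correct(total):
    """Return (digit written, carry passed left) for a column total."""
    return total % 10, total // 10


def write_column_buggy(total):
    """Bug: write the carry digit, drop the units digit, carry nothing."""
    if total >= 10:
        return total // 10, 0
    return total, 0


def columnwise(a, b, combine, write_column):
    """Work right to left, combining the two digits of each column."""
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]
    carry, written = 0, []
    for x, y in zip_longest(xs, ys):
        if x is None or y is None:
            # Assumption: a column with one digit just brings it down.
            total = (y if x is None else x) + carry
        else:
            total = combine(x, y) + carry
        digit, carry = write_column(total)
        written.append(digit)
    if carry:
        written.append(carry)
    return int("".join(str(d) for d in reversed(written)))


print(columnwise(2975, 1825, add, write_column_correct))  # ordinary addition
print(columnwise(2975, 1825, add, write_column_buggy))    # the student's addition
print(columnwise(68, 46, mul, write_column_buggy))        # the student's 68x46
```

Swapping only `write_column` changes the behavior of both skills at once, which is the sense in which a shared subprocedure accounts for similar errors in different skills.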



A First Approximation to Representing Procedural Skills

In order to build a computer system capable of diagnosing aberrant
behavior such as the above, the skill being taught must be represented in a
form amenable to modelling incorrect as well as correct procedures.
Additionally, the model should break the skill down into shared sub-skills
in order to account for the recurrence of similar errors in different
skills. We use the term diagnostic model to mean a representation that
depicts a student's internalization of a skill as a variant of a correct
version of the skill. For a representation of a correct skill to be useful
as a basis for a diagnostic model, it must make explicit much of the tacit
knowledge underlying the skill. In particular, it must contain all of the
knowledge that can possibly be misunderstood by a student performing the
skill, or else some student misconceptions will be beyond the diagnostic
modelling capabilities of the system. For example, if the model of
addition doesn't include the transcription of the problem, the system would
never be able to diagnose a student whose bug was to write 9's which he
later misread as 7's.

The technique we use to represent diagnostic models is a procedural
network (2). A procedural network consists of a collection of procedures
(with annotations) in which the calling relationships between procedures
are made explicit by appropriate links in the network. Each procedure node
has two main parts: a conceptual part representing the intent of the
procedure, and an operational part consisting of methods for carrying out
that intent. The methods (also called implementations) are programs (3)
that define how the results of other procedures are combined to satisfy the
intent of a particular procedure. Any procedure can have more than one
implementation, which provides a way to model different methods for
performing the same procedure (skill). For most skills, the network
representation takes the form of a lattice. Figure 1 presents an example
of how a part of the addition process is partially broken down into a
procedural network; conceptual procedures are enclosed in ellipses. The
top procedure in the lattice is addition (4). Two of the possible
algorithms for doing addition are presented as alternative methods. In
method 2, the columns are added from left to right with any carries being
written below the answer in the next column to the left. If there are any
carries, they must be added in a second addition. In method 1 (the
standard algorithm), the columns are added from right to left with any
carries being written above (and included in the column sum of) the next
column to the left. Notice that these two methods share the common
procedures for calculating a column sum and writing a digit in the answer,
but differ in the procedure they use when carrying is necessary. One
structural aspect of the network is to make explicit any subprocedures that
can be potentially shared by several higher level procedures.

(2) This term has been used by Earl Sacerdoti [1975] to describe an
interesting modelling technique for a partially ordered sequence of
annotated steps in a problem solving "plan". Our use of procedural nets
differs from, and is less developed than, his. The extensive treatment of
the structure and use of our networks is being reported in a companion
paper [Burton and Brown, forthcoming].

(3) The language we have used is LISP. The particular programming language
is unimportant from a theoretical standpoint because an implementation is
non-introspectable. The modelling aspects of the network must occur at the
conceptual procedure level. For example, the implementation of the
subtraction facts table look up procedure in the computer is necessarily
different from that in the student. However, the conceptual properties of
the facts table procedure are the same in both. Those aspects which are
the same (e.g., the invoking of other procedures, the values returned, the
relevant side effects) are included in the network, while the
implementation details, which may differ, are "swept under the rug" into
the program. This is not a limitation, as any "implementational issue" can
be elevated to the conceptual level by creating a new conceptual procedure
in between the existing ones. The distinction between conceptual and
implementation details can also be used to allow a single network to model
a skill efficiently at different levels.

(4) This is a simplified representation intended only to demonstrate those
features of the procedural network particularly relevant to the diagnostic
task. The actual breakdown into subprocedures may be different in a
particular network, and will be considerably more detailed.

[insert Figure 1]
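
As a rough illustration of the structure just described, the sketch below encodes a small part of an addition network in Python. It is not a reconstruction of Figure 1: the class, the procedure names and the dictionary layout are all assumptions made for the example. Each node has a conceptual part (`intent`), an operational part (`methods`) and explicit links (`calls`), and the two addition methods share the column-sum and write-digit subprocedures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Procedure:
    intent: str                                                  # conceptual part
    methods: Dict[str, Callable] = field(default_factory=dict)   # operational part
    calls: List[str] = field(default_factory=list)               # links to subprocedures


def digits(n: int, width: int) -> List[int]:
    """Digits of n, least significant first, padded to `width` columns."""
    return [(n // 10 ** col) % 10 for col in range(width)]


def value(columns: Dict[int, int]) -> int:
    """Number represented by a {column: digit} record."""
    return sum(d * 10 ** col for col, d in columns.items())


def column_sum(x: int, y: int, carry: int = 0) -> int:
    return x + y + carry


def write_digit(row: Dict[int, int], col: int, digit: int) -> None:
    row[col] = digit


def add_method_1(a: int, b: int) -> int:
    """Standard algorithm: right to left, carry included in the next column."""
    width = len(str(max(a, b)))
    xs, ys = digits(a, width), digits(b, width)
    answer, carry = {}, 0
    for col in range(width):
        total = column_sum(xs[col], ys[col], carry)
        write_digit(answer, col, total % 10)
        carry = total // 10
    write_digit(answer, width, carry)
    return value(answer)


def add_method_2(a: int, b: int) -> int:
    """Left to right; carries are written below and added in a second pass."""
    width = len(str(max(a, b)))
    xs, ys = digits(a, width), digits(b, width)
    answer, carries = {}, {}
    for col in reversed(range(width)):          # leftmost column first
        total = column_sum(xs[col], ys[col])
        write_digit(answer, col, total % 10)
        write_digit(carries, col + 1, total // 10)
    if value(carries) == 0:
        return value(answer)
    return add_method_2(value(answer), value(carries))


NETWORK = {
    "addition": Procedure(
        intent="add two numbers",
        methods={"method 1": add_method_1, "method 2": add_method_2},
        calls=["column sum", "write digit"],
    ),
    "column sum": Procedure("sum the digits of one column", {"standard": column_sum}),
    "write digit": Procedure("write a digit in the answer", {"standard": write_digit}),
}

for name, method in NETWORK["addition"].methods.items():
    print(name, method(328, 917))
```

Both methods give the same sums; they differ only in how carrying is handled, which is where the network would attach different subprocedures.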

The decomposition of a complex skill into all of its conceptual
procedures terminates in some set of primitives that reflects assumed
elements of an underlying computational model. For addition, typical
primitives are: recognizing a digit, being able to write a digit, and
knowing the concepts of right, left, etc. The complete procedural network
(explicitly specifying all the subprocedures of a skill) can be evaluated,
thereby simulating the skill for any given set of inputs. By itself, this
merely provides a computational machine which performs the skill and is
not of particular import. However, the possible "misconceptions" of this
skill are represented in the network by "buggy" implementations associated
with procedures in the decomposition. Each buggy version contains
incorrect actions taken in place of the correct ones. An extension to the
network evaluator enables the switching in of a buggy version of a
procedure, thereby allowing the network to simulate the behavior of that
buggy subskill. This provides a computational method for determining the
external behavior of the underlying bugs!
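
One way to picture "switching in" a buggy version is an evaluator that looks up every subprocedure call through the network and consults a table of active bugs. The Python sketch below is an assumption-laden illustration (class, method and procedure names are invented here); the buggy version shown is the unreset-carry bug from the earlier addition example.

```python
from typing import Callable, Dict, Optional


class Network:
    """Procedures with named implementations; bugs are switched in per run."""

    def __init__(self) -> None:
        self.impls: Dict[str, Dict[str, Callable]] = {}

    def define(self, name: str, version: str, fn: Callable) -> None:
        self.impls.setdefault(name, {})[version] = fn

    def evaluate(self, name: str, *args, bugs: Optional[Dict[str, str]] = None):
        active = bugs or {}

        def call(proc: str, *proc_args):
            # Use the buggy version if one is switched in, else the correct one.
            version = active.get(proc, "correct")
            return self.impls[proc][version](call, *proc_args)

        return call(name, *args)


def add(call, a: int, b: int) -> int:
    width = len(str(max(a, b)))
    answer = carry = total = 0
    for col in range(width):
        x = (a // 10 ** col) % 10
        y = (b // 10 ** col) % 10
        total = x + y + carry
        answer += (total % 10) * 10 ** col
        carry = call("update carry", carry, total)
    # Overflow of the leftmost column is written in front of the answer.
    return answer + (total // 10) * 10 ** width


net = Network()
net.define("add", "correct", add)
net.define("update carry", "correct", lambda call, old, total: total // 10)
net.define("update carry", "not reset", lambda call, old, total: old + total // 10)

print(net.evaluate("add", 328, 917))
print(net.evaluate("add", 328, 917, bugs={"update carry": "not reset"}))
```

The top-level `add` procedure is untouched; only the implementation chosen for "update carry" differs between the two runs.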


Inferring a Diagnostic Model of the Student

The problem of diagnosing a deep structure failure in a student's
knowledge of a procedural skill can now be accomplished, at least
theoretically, in a straightforward manner. Suppose, as in the examples on
page 4, we are provided with several surface manifestations of a deep
structure misconception or bug in the student's addition procedure. To
uncover which possible subprocedures are at fault, we use the network to
simulate the behavior of buggy subprocedures over the set of problems, and
note those which generate the same behavior as exhibited by the student.
To catch a student's misconceptions that involve more than one faulty
subprocedure, we must be able to simulate various combinations of bugs (5).
For example, a student may have a bug in his carrying procedure as well as
believing that 8+7 is 17 (a bug in his addition facts table). To model his
behavior, both buggy versions must be used together. A deep structure
model of the student's errors is a set of buggy subprocedures which, when
invoked, replicate those errors. Each buggy version has associated
information (6), such as the underlying teleology of the bug, specific
remediations, explanations, examples and so on. These may be used by a
tutoring system to help correct the student's problem.
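
The diagnostic procedure described in this section (simulate single bugs and combinations of bugs over the observed problems, and keep those that replicate the student's answers) can be sketched as follows. This is a hedged Python illustration, not the authors' system: the two bug names, the `make_adder` helper and the observed answers are constructed for the example.

```python
from itertools import combinations

CARRY_BUG = "carry not reset"
FACT_BUG = "fact 8+7=17"


def make_adder(bugs):
    """Build an addition procedure with the given set of bugs switched in."""

    def fact(x, y):
        if FACT_BUG in bugs and {x, y} == {7, 8}:
            return 17                      # faulty entry in the facts table
        return x + y

    def adder(a, b):
        width = len(str(max(a, b)))
        answer = carry = total = 0
        for col in range(width):
            x = (a // 10 ** col) % 10
            y = (b // 10 ** col) % 10
            total = fact(x, y) + carry
            answer += (total % 10) * 10 ** col
            overflow = total // 10
            carry = carry + overflow if CARRY_BUG in bugs else overflow
        return answer + (total // 10) * 10 ** width

    return adder


def diagnose(observations, bug_names, max_bugs=2):
    """Return every bug combination that reproduces all observed answers."""
    matches = []
    for k in range(max_bugs + 1):
        for combo in combinations(bug_names, k):
            student = make_adder(frozenset(combo))
            if all(student(a, b) == ans for (a, b), ans in observations):
                matches.append(combo)
    return matches


# Hypothetical observations of a student who has both bugs.
observed = [((328, 917), 1347), ((201, 399), 700)]
print(diagnose(observed, [CARRY_BUG, FACT_BUG]))
```

With these observations neither bug alone explains the data; only the pair does, which is why combinations have to be simulated.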

(5) Additional structure in the network helps resolve what combination of
bugs are worth considering. In general, simulating or evaluating all
simple and multiple bugs takes approximately 2 cpu seconds for the addition
and subtraction procedural nets.

(6) West [1971] has broken down the diagnostic teaching task into four
steps: 1) distinguish between conceptual and careless errors; 2) identify
the exact nature of the conceptual error (bug); 3) determine the conceptual
basis (cause) of the bug; and 4) perform the appropriate remediation. The
system we describe has been directed towards problems (1) and (2). The
buggy implementation nodes in the network provide the proper places to
attach information relevant to problems (3) and (4).

Relationship of Diagnostic Models to Other Kinds of Structural Models

It is beyond the scope of this paper to discuss all the past and
current work on structural models of students and how it relates to
diagnostic models based on procedural networks. However, a few words are
in order. Most previous and current research on this subject has been
focussed on the intuitively appealing notion which postulates that if one
has an explicit, well formulated model of the knowledge base of an expert
(for a given set of skills or a problem domain) then one can model a
particular student's knowledge as a contraction or simplification of the
rules comprising the expert [Collins, Warnock and Passafiume 1975, Brown,
Burton and Bell 197?, Burton and Brown 1976, Carr and Goldstein 1977].
Recently, Goldstein has articulated this concept in his Computer Coach
research and has coined the term "overlay model" for capturing how a
student's manifested knowledge of skills (rules) relates to an expert's
knowledge base [Goldstein 1977]. In all these cases, the primary problem
has been to develop techniques to discover 1) which skills were employed by
the student in solving problems, 2) which skills were not used, and 3)
which skills an expert would have used which the student did not.

The work reported in this paper differs in emphasis from such
approaches in that the basic modelling technique focuses on viewing a
structural model of the student not primarily as a simplification of the
expert's rules but rather as a set of semantically meaningful deviations
from an expert's knowledge base (7). That is, each subskill of the expert
is explicitly encoded, along with a set of potential misconceptions of that
subskill. The task of inferring a diagnostic model then becomes one of
discovering which set of variations or deviations best explains the surface
behavior of the student. This view is in concert with (although more
structured than) the approach taken by Self [197?] in which he models the
student as a set of modified procedures taken from a procedural expert
problem solver.

We shall now consider examples of procedural skills in arithmetic,
evaluations of the networks for these skills, and then we shall shift our
focus to some pedagogical uses of the procedural network notion.

Procedural Knowledge Used in Subtraction

To provide an example indicative of the surprising amount of
procedural knowledge needed to perform a simple skill, let us consider a
more complete network representation of the subtraction of two numbers (8).
Figure 2 shows the links of the procedural network for subtraction that
indicate which procedures a procedure may use. The network has been
simplified by showing only one implementation of each procedure (i.e. the
one taught in the "standard" algorithm).

(7) Because these deviations are based on both the student's intended goals
and underlying teleology of the subskills, we have no automatic way to
generate them (as opposed to what could be done if the deviations were
based on the surface syntax of the rules). However, ongoing work by
Goldstein and Miller [1976], Rich and Schrobe [197?] and Burton and Brown
[forthcoming] will eventually help overcome this limitation.

(8) We have chosen just one of the several subtraction algorithms (the
so-called "standard" algorithm) but the ideas presented here apply equally
to others.



[insert Figure 2] 




The top most node represents the subtraction of two n-digit numbers.
It may use the procedure for: setting up the problem, transforming it if
the bottom number is greater than the top, and sequencing through each
column performing the column subtraction. The implementation of the latter
has to account for cases where borrowing is necessary and may call upon
many separate subprocedures including taking the borrow from the correct
place, scratching 0 and writing 9 if that place contains a zero, and so on.
An important subprocedure is the facts table look-up where any of the
simple arithmetic facts can be wrong, including the addition of 10 to a
column digit, the subtraction of 1 during a borrowing operation, or any
subtraction facts used during the processing of a column.

In principle, each of these subprocedures could have many buggy
versions (9) associated with it. An example of a common bug is to calculate
the column difference by subtracting the smaller digit from the larger
regardless of which is on top. In another bug, the set-up procedure
left-justifies the top and bottom numbers so that when the student is told
to subtract 13 from 185, he gets 55. One interesting thing about the left
justification bug is that the student will be faced with seemingly
impossible problems (185-75) and may be inclined to change the direction in
which he subtracts, borrowing from left to right instead of from right to
left, or to change his column difference procedure to larger minus smaller,
thereby eliminating the need to borrow. Thus, there can exist
relationships between bugs such that one bug suggests others. A major
challenge in identifying the procedural breakdown or description of a skill
is to have the network naturally handle ramifications and interactions of
multiple bugs, as well as to provide a natural way to define and handle all
common bugs.

(9) On the average our network has two to three buggy versions for each
correct version of a subprocedure.
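
The two subtraction bugs named above are easy to state as code. The following Python sketch is illustrative only; the function names are hypothetical, and the left-justification version assumes that every step after the faulty set-up is carried out correctly.

```python
def subtract_smaller_from_larger(top: int, bottom: int) -> int:
    """Bug: in each column, subtract the smaller digit from the larger."""
    width = len(str(max(top, bottom)))
    answer = 0
    for col in range(width):
        t = (top // 10 ** col) % 10
        b = (bottom // 10 ** col) % 10
        answer += abs(t - b) * 10 ** col
    return answer


def subtract_left_justified(top: int, bottom: int) -> int:
    """Bug in the set-up procedure: the numbers are aligned on the left."""
    shift = len(str(top)) - len(str(bottom))
    if shift < 0:
        raise ValueError("sketch assumes the top number is at least as long")
    # Left-justifying the bottom number is the same as appending zeros to it.
    return top - bottom * 10 ** shift


print(subtract_left_justified(185, 13))    # the example from the text
print(subtract_left_justified(185, 75))    # negative: a "seemingly impossible" problem
print(subtract_smaller_from_larger(185, 75))
```

The negative result for 185-75 corresponds to the situation in which the student, finding that he cannot borrow, may be tempted to adopt a second bug.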

Exhaustive Evaluation of the Network

Given a procedural network like the one in Figure 2, it is not always
obvious how bugs in any particular subprocedure or several subprocedures
will be manifested on the surface (i.e. in the answer), especially
since bugs can have serious interactions or since a single buggy
subprocedure can be used by several higher-order procedures in computing an
answer. In fact, if asked to make predictions about the symptoms of a
given bug, people often determine the symptoms by considering only the
skills or subprocedures used in solving a particular sample problem. As
a result they often miss symptoms generated by other procedures that can,
in principle, use or call on the given buggy subprocedure but which,
because of the characteristics of the particular problem, weren't called.
Yet if another sample problem were chosen, it would have caused the
particular faulty subprocedure to have been used for a different purpose or
in a different way, thereby generating different symptoms. Determining the
complete set of symptoms for a bug is further complicated by the fact that
sometimes a buggy subprocedure can be called by several higher order
procedures in the midst of solving just one problem. It was this
observation that first led us to consider the diagnostic value of this
scheme for systematically verifying a conjectured bug.

In order to provide a feeling for the range of "answers" that can come
from simple underlying bugs, we have included in Figure 3 the "answers" to
a subtraction problem (15300-9522) using some of the bugs in the
procedural network for subtraction. For example, the answer 14222 was
generated by the bug which subtracts the smaller digit, in each column,
from the larger. Appendix 4 gives one brief explanation of a bug that
would generate each of the answers in Figure 3.
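
A small Python sketch can illustrate why a single sample problem is a poor guide to a bug's symptoms. It evaluates two bugs over several problems and lists the problems on which each bug is visible. Only "smaller from larger" comes from the text; "borrow without decrement" is a hypothetical bug added for contrast, and the problem set is invented.

```python
def subtract(top: int, bottom: int, bug: str = "") -> int:
    """Column subtraction (top >= bottom) with an optional bug switched in."""
    width = len(str(top))
    answer = borrow = 0
    for col in range(width):
        t = (top // 10 ** col) % 10 - borrow
        b = (bottom // 10 ** col) % 10
        if bug == "smaller from larger":
            diff, borrow = abs(t - b), 0
        elif t < b:
            diff = t + 10 - b
            # Hypothetical bug: take the ten but never decrement the neighbor.
            borrow = 0 if bug == "borrow without decrement" else 1
        else:
            diff, borrow = t - b, 0
        answer += diff * 10 ** col
    return answer


def symptomatic(bug: str, problems):
    """Problems on which the bug produces a wrong answer."""
    return [p for p in problems if subtract(*p, bug) != subtract(*p)]


PROBLEMS = [(58, 23), (185, 13), (52, 27), (15300, 9522)]
for bug in ("smaller from larger", "borrow without decrement"):
    print(bug, [subtract(*p, bug) for p in PROBLEMS], symptomatic(bug, PROBLEMS))
```

On the first two problems neither bug shows at all, on 52-27 both bugs give the same wrong answer, and only 15300-9522 tells them apart.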



[Figure 3: Manifestations of Some Subtraction Bugs. The figure's table of
"answers" to 15300 - 9522 is illegible in the source.]

Of course, a particular "answer" to a given problem can have more than
one explanation, or cause, since there can be several distinct bugs that
generate the same "answer". For example, a student may harbor many
misconceptions and still get the correct answer to a particular problem.
The need for teachers to thoroughly appreciate and strategically cope with
the possible range of student bugs led us to construct a game called
BUGGY.

BUGGY - An Instructlfenal Activity y 

BUGGY is a computerized game based on the diagnostic interactions of a
teacher and a computerized student. The teacher's role may be played by
one or more persons. The teacher is presented with an arithmetic homework
problem that the "student" has done incorrectly. The "student's" behavior
is generated using a procedural network, and manifests an underlying bug
in one of the arithmetic subprocedures. The teacher's job is to diagnose
the computerized student by providing strategic test problems for the
"student" to solve, in order to discover exactly what the underlying bug or
misconception is. The problems given by the teacher are answered by the
"student" using the bugged procedure. When the teacher thinks he knows the
bug, he signals the computer program by pressing a "got it" key. BUGGY
then asks the teacher to describe what he thinks the bug is. To make
certain that he really has found the bug, a five-problem test is given in
which the teacher must answer the problems in the same way that the student
would do them, i.e. he must simulate the "student's" bug. Success is
achieved when all five problems are done "correctly". Then the teacher
proceeds to a new bug.
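
A five-problem test of this kind could be assembled roughly as follows. The commentary in the protocol below says that BUGGY filters generated problems to try to get three that cause symptoms and two that give correct answers; everything else here is an assumption made for illustration, including the names `add_without_carry` and `make_test`, the operand range, and the use of a no-carry addition bug as the example.

```python
import random


def add_without_carry(a: int, b: int) -> int:
    """Illustrative bug: add each column but never carry."""
    width = max(len(str(a)), len(str(b)))
    digits_a, digits_b = str(a).zfill(width), str(b).zfill(width)
    return int("".join(str((int(x) + int(y)) % 10)
                       for x, y in zip(digits_a, digits_b)))


def make_test(bug, n_symptom=3, n_correct=2, seed=0, max_tries=10_000):
    """Draw random problems, keeping some that expose the bug and some that do not."""
    rng = random.Random(seed)
    symptomatic, correct = [], []
    for _ in range(max_tries):
        a, b = rng.randint(1, 99), rng.randint(1, 99)  # hypothetical operand range
        if bug(a, b) == a + b:
            bucket, limit = correct, n_correct
        else:
            bucket, limit = symptomatic, n_symptom
        if len(bucket) < limit and (a, b) not in bucket:
            bucket.append((a, b))
        if len(symptomatic) == n_symptom and len(correct) == n_correct:
            break
    problems = symptomatic + correct
    rng.shuffle(problems)  # do not reveal which problems are the symptomatic ones
    return problems


test = make_test(add_without_carry)
for a, b in test:
    print(f"{a} + {b} -> the buggy student writes {add_without_carry(a, b)}")
```

Mixing in problems that the bugged procedure answers correctly forces the teacher to know when the bug does not apply, not just what it does when it fires.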

The following is a protocol of a team of teachers using BUGGY. The
dialogue is interspersed with commentary about the issues that we feel are
pedagogically important. Those lines typed by the team are marked with a
vertical line in the left margin.

Protocol of a Team Using BUGGY

WELCOME TO BUGLAND.

I HAVE CHOSEN A BUG. HERE IS AN EXAMPLE OF THE BUG.

      17
      +5
      --
      13

     The system presents an example in which the student doesn't get
     the right answer. At present, these initial examples are stored
     in the network linked to the buggy implementation.

NOW YOU CAN GIVE ME PROBLEMS TO DETERMINE THE BUG.

|     18      43
|     +6     +79
      --     ---
      15      23

     The team generates problems that are answered by the "student".
     Notice that the team's test problems are similar to the initial
     example generated by BUGGY. In general this is a poor diagnostic
     strategy, since it may leave other symptoms of a bug
     undiscovered.

| Got the bug!

     The team forms an initial hypothesis and tells the system they
     have discovered the bug. The system now asks the team to
     describe the bug and then tests them on their understanding.
     Asking for a description forces the team to verbalize their
     sometimes fuzzy idea of what the bug is and often makes them
     realize how ill-defined or vague their idea actually is.(10)



(10) In the present system, the descriptions are saved but they are not
currently used, since the understanding of these utterances is beyond the
capabilities of any existing natural language system.
PLEASE DESCRIBE THE BUG YOU THINK I HAVE.

| Student adds the number to be carried into the last column
| worked in.

     This cryptic description is typical of those we have observed.
     Describing bugs in English is surprisingly difficult. What the
     team is trying to say is that if a column adds to more than 9,
     the units digit and the carry digit are added together and the
     result is written as the answer. For example, in the first
     column of the example problem, when 7 and 5 are added, 1 (the
     carry) and 2 (the units digit) are added to get 3. The same with
     8+6, 2+9, and 3+9. (This bug could easily arise if the students
     were using method 2 from an earlier figure and mistakenly wrote
     the carry under the same column.) The hypothesis makes sense.
     Unfortunately, it is not the student's bug. Notice the possibly
     humorous, probably disastrous results that would have transpired
     if the team had stopped at this point and tried to tutor the
     student on the basis of their initial belief!

HERE ARE SOME PROBLEMS TO TEST YOUR THEORY ABOUT THE BUG.

WHAT IS:

     To ensure that the team has found the correct bug, the system
     generates several problems for the team to solve. (At present,
     BUGGY has a heuristic problem generation program that gives a
     broad range of "different" problems. These are then filtered to
     try to get three problems which cause symptoms and two that give
     correct answers. This simple routine does surprisingly well at
     producing critical problems. So well, in fact, that a more
     "intelligent" routine which used meta-evaluation to determine the
     consistent set of bugs and then filtered the test problems to
     split the set was discarded as unnecessarily time-consuming.)

      21
     +39

     The team's hypothesis is revealed as incorrect by the first of
     the five test problems. BUGGY presents its answer and then
     allows the team to enter more test problems.



THAT'S NOT THE BUG I HAVE.
USING MY BUG:

      21
     +39
     ---
      15

TRY GIVING ME SOME MORE EXAMPLES.

|     51      99      68
|  +1707     +99      +1
     After failing the test, the team gives a greater range of
     problems, specifically those using bigger numbers. How is it
     that 51+1707 can add up to 21? By broadening the scope of
     example problems, they are able to reevaluate their initial
     hypothesis, dismiss it, and develop another one that is complete
     and consistent with respect to the new errors.

| Got the bug!

PLEASE DESCRIBE THE BUG YOU THINK I HAVE.

| The student is adding all the digits together.

HERE ARE SOME PROBLEMS TO TEST YOUR THEORY ABOUT THE BUG.

WHAT IS:



.33 1091 8 28 *50' • ^ ' V ; 

_i60 ±2i ±10 jt6. 
1^ 17 • 15- 17 15 

     This time the team's hypothesis passes the test and they are
     given the description of the bug stored in the procedural
     network.

VERY GOOD. MY DESCRIPTION OF THE BUG IS:

THE STUDENT ALWAYS SUMS UP ALL THE DIGITS WITH NO REGARD TO COLUMNS.
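
The protocol can be replayed in a short sketch that sets the team's first hypothesis against the bug BUGGY had actually chosen. Both function names are hypothetical, and `carry_folded_into_column` is only one reading of the team's cryptic description (when a column sums past 9, the carry digit and units digit are added and written in that column).

```python
def sum_all_digits(a: int, b: int) -> int:
    """BUGGY's stored bug: sum every digit with no regard to columns."""
    return sum(int(d) for d in f"{a}{b}")


def carry_folded_into_column(a: int, b: int) -> int:
    """The team's first hypothesis, as interpreted in the commentary."""
    width = max(len(str(a)), len(str(b)))
    digits_a, digits_b = str(a).zfill(width), str(b).zfill(width)
    out = []
    for x, y in zip(digits_a, digits_b):
        s = int(x) + int(y)
        # Past 9, write carry digit + units digit instead of carrying.
        out.append(str(s // 10 + s % 10) if s > 9 else str(s))
    return int("".join(out))


for a, b in [(17, 5), (18, 6), (43, 79), (21, 39)]:
    print(a, b, sum_all_digits(a, b), carry_folded_into_column(a, b))
```

On the initial example and the team's first two problems the two procedures give identical answers (13, 15 and 23), which is why the wrong hypothesis looked convincing. They part ways on 21+39, where the stored bug gives 15 and the hypothesis gives 51.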



Pedagogical Issues

One application of BUGGY and the "diagnostic model" view of procedural
skills lies in the domain of instructor training. The realization that
"random" errors are actually the surface manifestations of an underlying
bug in a procedure is a major conceptual breakthrough for many instructors.
Often behavior that appears to be random has a simple, intelligent, and
complete underlying explanation. By proper diagnosis, remediation can be
directed towards the specific weaknesses. The importance of this notion
cannot be overstressed. Admitting the possibility of underlying bugs is
critical to remediation in the classroom. Without the ability to diagnose
procedural bugs, failure on a particular problem must be viewed as either
carelessness or total algorithm failure. In the first case, the
remediation consists of giving more problems, while in the second, it
consists of going over the entire algorithm.(11) When a student's bug
(which may only manifest itself occasionally) is not recognized by the
instructor, the errant behavior must be explained as carelessness, laziness
or worse. This causes the instructor to adapt his model of the student's
capabilities, thereby mistakenly lowering his expectations. From the
student's viewpoint, the situation is even worse. He is following what he
believes to be the correct algorithm and, seemingly at random, gets marked
wrong. This situation can be exacerbated by improper diagnosis. For
example, Max subtracts 284 from 437 and gets 253 as an answer. "Of course,"
says the instructor, "you forgot to subtract 1 from 4 in the hundreds place
when you borrowed." Unfortunately Max's algorithm is to subtract the
smaller digit in each column from the larger. Max doesn't have any idea
what the instructor is talking about (he never "borrowed"!) and feels that
he must be very stupid indeed not to understand. The instructor agrees
with this assessment since none of his remediation has had any effect on
Max's performance.

(11) In computer programming metaphors, this corresponds to the debugging
activities of resubmitting the program and throwing the whole program away
and starting over from scratch because the computer must have made a
mistake.

BUGGY, in its present form, presents instructors with examples of
buggy behavior and provides practice in diagnosing the underlying causes of
errors. Using BUGGY, the instructor gains experience in forming theories
about the relationship between the symptoms of a bug and the underlying bug
itself. This experience can also be cultivated to make instructors aware
that there are methods or strategies that they can use to properly diagnose
bugs. There are a number of strategy bugs that instructors may have in
forming hypotheses about a student's misconceptions. The development of a
good "troubleshooting" strategy by an instructor can avoid these pitfalls.
A common mistake is to jump too quickly to one hypothesis. Prematurely
focussing on one hypothesis can cause a teacher to be unaware that there
are many competing hypotheses that are just as likely, or possibly more
likely. A common consequence of this is that the instructor only generates
problems for the student that confirm his own incorrect hypothesis! For
example, one student teacher was given the initial example (A) (shown
following) after which he proceeded to generate example problems:

      A       B       C

     19      23      81
     +9      +6      +8
    ---     ---     ---
    199     236     818



At this point, he concluded that the bug was "writes the bottom digit after
the top number." But his hypothesis failed when he was given the first
test problem:

      8
    +12

to which he responded 812. The bug actually is that single digit operands
are concatenated on the end of the other operand, so that the correct buggy
answer is 128. By presenting only examples with fewer digits in the bottom
number, he got only confirming evidence for his hypothesis.

In some cases, an instructor may believe his hypothesis so strongly
that he will ignore disconfirmations that exist or decide that these
disconfirmations are merely random noise.(12) One way this can be avoided
is by using the technique of differential diagnosis [Rubin 1975] in which
one always generates at least two hypotheses and then chooses test problems
that separate them.
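
Differential diagnosis can be sketched by writing each hypothesis as a procedure and searching for problems on which the procedures disagree. The code below uses the concatenation example from the text. All names are hypothetical, and both functions assume (this is not stated in the report) that the student adds correctly when neither operand is a single digit.

```python
from itertools import product


def concat_single_digit(top: int, bottom: int) -> int:
    """The stored bug: a single-digit operand is appended to the other operand."""
    if bottom < 10:
        return int(f"{top}{bottom}")
    if top < 10:
        return int(f"{bottom}{top}")
    return top + bottom  # assumption: otherwise the student adds correctly


def bottom_after_top(top: int, bottom: int) -> int:
    """The student teacher's hypothesis: write the bottom number after the top."""
    if top < 10 or bottom < 10:
        return int(f"{top}{bottom}")
    return top + bottom  # same assumption as above


def separating_problems(h1, h2, operands=range(1, 30)):
    """Problems on which two hypothesized procedures give different answers."""
    return [(a, b) for a, b in product(operands, repeat=2) if h1(a, b) != h2(a, b)]


examples = [(19, 9), (23, 6), (81, 8)]  # problems A, B and C from the text
agree = all(concat_single_digit(a, b) == bottom_after_top(a, b) for a, b in examples)
splits = separating_problems(concat_single_digit, bottom_after_top)
print(agree, (8, 12) in splits)
```

Problems A, B and C cannot tell the two hypotheses apart. Every separating problem has a single-digit top number over a longer bottom number, the one kind of problem the student teacher never tried.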

Another important issue concerns the relationship between the language
used to describe a student's errors and its effect on what a teacher
should do to remediate them. Is the language able to convey to the student
what he is doing wrong? Should we expect instructors to be able to use
language as the tool for correcting the buggy algorithms of students? Or
should we only expect instructors to be able to understand what the bug is
and attempt remediation with the student using things like manipulative
math tools? The following are quotes of student teacher hypotheses taken
from protocols of BUGGY, which give a good idea of how difficult it is to
express procedural ideas in English. The descriptions in parentheses are
BUGGY's (prestored) explanations of the bugs.

(12) There is, of course, some amount of "processor failure" as students
are often all too human.
"Random errors in carryover." (Carries only when the next column in the top
number is blank.)

"If there are less digits on the top than on the bottom he adds columns
diagonally." (When the top number has fewer digits than the bottom number,
the numbers are left-justified and then added.)

"Does not like zero in the bottom." (Zero from any number is zero.)

"Child adds first two numbers correctly then when you need to carry in the
second set of digits child adds numbers carried to bottom row then adds
third set of digits diagonally finally carrying over extra digits." (The
carry is written in the top number to the left of the column being carried
from and is mistaken for another digit in the top number.)

"Sum and carry all columns correctly until get to last column. Then takes
furthest left digit in both columns and adds with digit of last carried
amount. This is in the sum." (When there are an unequal number of digits
in the two numbers, the columns that have a blank are filled with the
left-most digit of that number.)




What does this say to us? Even when one knows what the bug is in
terms of being able to mimic it, how is one going to explain it to the
student having problems? Considering the above examples, it is clear that
anyone asked to solve a set of problems using these explanations would no
doubt have real trouble. One can imagine a student's frustration when the
teacher offers an explanation of why he is getting problems marked wrong,
and the explanation is as confused and unclear as these are. For that
matter, when the correct procedure is described for the first time, could
it too be coming across so unclearly?

This issue is further complicated by the existence of another
important issue: there are fundamentally different bugs which cause
identical behavior! In other words, there can be several distinct bugs
that are logically equivalent and always generate the same "answers". For
example, here is a set of problems:

38 186 298 ^ • 89 ' 

' • ±2Sii ±162 ±ii 
• m 2330 2357 2fll 

The underlying flaw in the student's procedure (his bug) can be
described as "The columns are added without carries and the left-most digit
in the answer is the total number of carries required in the problem." In
this case, the student views the carries as tallies to be counted and added
to the left of the answer. But another equally plausible bug also exists:
the student is placing the carry to the left of the next digit in the top
number instead of adding it to the digit (i.e. he is actually carrying ten
times the carry digit). This generates the same symptoms. So even when
the teacher is able to describe clearly what he believes is the underlying
bug, he may be addressing the wrong one. The student may actually have
either one of these bugs.

We feel that all of the issues discussed above are as important for
students learning procedures as they are for teachers. In particular, the
diagnostic task of a player requires studying the structure of the
procedural skill per se as opposed to merely performing it. This can be
especially important if we are trying to get students not to just rotely
memorize the procedural skill but to encode it in some semantically
meaningful way.

Another reason for having students develop a language for talking
about procedures, processes, bugs, etc. is that this language enables the
student to talk about (and think about) procedures and the underlying
causes of his own errors. This is important in its own right, but it also
gives a student the motivation and the apparatus for stepping back and
critiquing his own thinking, as well as saying something interesting and
useful about his errors. This is especially important given the fact that
there's been very little success in getting students to look over their own
work (such as estimating answers) and to use this perusal to good
advantage.

(13) This leads to an interesting question concerning how one can "prove"
two different descriptions of bugs entail logically the same surface
manifestations.
An Experiment Using BUGGY

We have conducted an experiment to explore BUGGY's impact on student
teachers. In particular, we wished to answer the question of whether
exposure to BUGGY significantly improves the student teachers' ability to
detect regular patterns of errors in simple arithmetic problems. The
subjects were twenty-four undergraduate education majors from Lesley
College in Cambridge. They were all volunteers who were not paid for their
services. The 24 subjects were divided into twelve groups of two each.

Their exposure to BUGGY lasted approximately one and a half hours with
most teams completing at least six different bug sessions. Both addition
and subtraction bugs were presented. The first two bugs each team
encountered were chosen from a list of simple bugs, so as not to compound
difficulties the subjects faced in just getting used to using a computer
terminal and to BUGGY.

The effects of their exposure to BUGGY were measured by comparing each
subject's performance on pre- and post-exposure tests. There were two such
tests, labelled Red and Blue. The twenty-four subjects were randomly
assigned to two groups. One group had the Red test before exposure and
the Blue test after, and the other group had them in reverse order. Each
test had ten items, each item consisting of a set of four simple addition
or subtraction problems with their "solutions". Seven of the items in each
test contained "patterned" errors, such that the four solutions could all
be arrived at as a result of a single misapplied rule, for example,
failure to carry when a column adds to more than 10. The other three items
were "random" items in which there was no single explanation for all of the
errors. (See Appendix 1 for the Red test.) For the experiment, BUGGY was
modified so that no subjects were given bugs that occurred on their
post-tests.

Results

The raw data generated by the tests are shown in Table 1. The items
across the top (1P, 2P, 3R, ...) indicate the problem number and whether the
correct problem description was random (R) or could be explained by a
single bug description or pattern (P). The subjects' responses were scored
and assigned to four categories: PC, PI, PW, R, plus one extra category of
Not Attempted (NA). The first letter stands for the type of response the
subject made, where P=pattern and R=random. The second letter is the
quality of the explanation the subject made on that item: C=consistent or
complete (the subject's single explanation explains all the errors),
I=inconsistent (the subject's explanation is not contradicted by any of the
problems but does not explain all errors), and W=wrong (the subject's
explanation is contradicted by at least one of the problems). For the case
of "R", Random-Consistent is implied.

[insert Table 1]
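
The three pattern categories can be made concrete with a small sketch. In the experiment the subjects' written explanations were scored by people. Here, purely as a modelling assumption, an explanation is a function that returns the answer it predicts for a problem, or `None` when it says nothing about that problem. The item and all names are hypothetical.

```python
def score_explanation(explanation, item):
    """Classify a pattern explanation for one test item as PC, PI or PW.

    `item` is a list of (a, b, observed_answer) triples.
    """
    pairs = [(explanation(a, b), observed) for a, b, observed in item]
    if any(p is not None and p != o for p, o in pairs):
        return "PW"  # contradicted by at least one problem
    if all(p == o for p, o in pairs):
        return "PC"  # explains every solution in the item
    return "PI"      # never contradicted, but leaves a solution unexplained


def no_carry(a, b):
    """Add each column but never carry."""
    width = max(len(str(a)), len(str(b)))
    da, db = str(a).zfill(width), str(b).zfill(width)
    return int("".join(str((int(x) + int(y)) % 10) for x, y in zip(da, db)))


def no_carry_two_digit_only(a, b):
    """A partial explanation: silent when an operand has a single digit."""
    return no_carry(a, b) if a >= 10 and b >= 10 else None


# Hypothetical item: four additions worked by a student who never carries.
item = [(27, 15, 32), (48, 36, 74), (19, 3, 12), (56, 27, 73)]

print(score_explanation(no_carry, item))                 # PC
print(score_explanation(no_carry_two_digit_only, item))  # PI
print(score_explanation(lambda a, b: a + b, item))       # PW
```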

First, let us compare the results of Pre and Post tests, combining the
results across the two groups of subjects and across the Red and Blue
tests. The distribution of responses is shown in Table 2 together with
values for Chi-squared.

[insert Table 2]

There was a significant improvement on the patterned items. The number of
correct responses for patterns (PC) rose (p=0.0il8 by one-tailed binomial
test). The number of pattern descriptions disconfirmed by one of the
solutions it was supposed to describe (PW responses) fell significantly
(p=0.02 by one-tailed binomial test). The number of random (R) responses,
where a patterned bug was incorrectly described as a random error, also
fell (by one-tailed binomial test).
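
For readers unfamiliar with the statistic, a one-tailed binomial test of the kind cited above can be computed as follows. The report does not say how trials and successes were counted, so the sketch assumes a sign-test reading (each changed response counts as one trial with chance probability 0.5), and the counts in the example are hypothetical, not data from the experiment.

```python
from math import comb


def one_tailed_binomial(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts: 9 of 10 changes were improvements.
p_value = one_tailed_binomial(9, 10)
print(p_value)  # equals 11/1024
```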

The results on the Random test items also showed improved performance
after exposure to BUGGY, although they fail to reach significance. The
number of Random (R) responses for random items increased; the number of
Pattern responses contradicted by at least one of the examples (PW)
decreased; and the number of items not attempted (NA) fell, suggesting
that speed increased slightly. (Almost all of the reduction in the number
not attempted occurred on the final random items, which were the last item
in the Red test, and the next to last in the Blue test.) The number of
pattern-inconsistent (PI) responses increased on both patterned
and random items, suggesting that the exposure to BUGGY increased the
subjects' sensitivity to the presence of patterning.
TABLE 2

Response      Patterned Items           Random Items
              Pre-Test   Post-Test      Pre-Test   Post-Test

PC               55
The foregoing conclusions depend on two assumptions implicit in the
experimental design: that the two groups of subjects were equivalent, and
that the Red and Blue tests were equivalent. To confirm that the two
groups of subjects were equivalent, the responses obtained in the Pre-tests
were combined with those from the Post-tests, for each group, as shown in
Table 3.

[insert Table 3]

The two groups yielded very similar distributions of responses for both
Patterned and Random items. The differences are not significant by
Chi-squared test, and a large portion of the obtained Chi-squared values
derive from the difference in the number of Random responses between the
two groups, which appears in both the Patterned and in the Random test
items.

The second assumption is that the Red and Blue tests are equivalent.
The Pre- and Post-test responses are combined separately for the Red and
Blue tests in Table 4.

[insert Table 4]

There is no difference between the two tests in the Random items, but
the patterned items were significantly easier in the Blue test than in the
Red test. The number of correct responses was greater for the Blue test,
and the number not attempted was smaller, though neither difference is
significant by one-tailed binomial test. On the other hand, there were
significantly more internally-inconsistent errors (PW) on the Blue test
(p=.04 by two-tailed binomial). This difference between the Red and Blue
tests is unimportant as long as the pattern of differences is similar for
both the Pre-test and the Post-test. Table 5 shows the distribution of
responses to Patterned test items for Red and Blue tests separately for
Pre-exposure and for Post-exposure applications. (Note that different
groups of subjects are involved, so the validity of the conclusions depends
on our earlier finding of no difference between the two groups.)

[insert Table 5]
TABLE 4

Response      Patterned Items      Random Items
              Red      Blue        Red      Blue
An inspection of Table 5 shows that the difference between the two
tests is very similar for the Pre- and Post-exposure applications (with the
single exception of the Random responses) and is certainly not large enough
to cast doubt on the main conclusion. We can, therefore, conclude that
exposure to BUGGY significantly improved the subjects' ability to detect
regular patterns of errors in simple arithmetic problems.

Qualitative Impressions

The next question to be investigated concerned the issue of what the
subjects (student teachers) themselves felt they gained from their exposure
to BUGGY. In order to assess their impressions, we convened the entire
group during the evening when they had finished using BUGGY. At that
gathering, we first asked them to write their responses to two questions
(discussed below) and then taped a final group discussion in which we
sought their reactions to BUGGY and their suggestions for its deployment
with school-aged students. The following week, their professor held a
similar group discussion (who also participated in the initial experiment)
and reported back to us the consensus, which was consistent with what they
had written.

Appendix 2 lists all the written responses to the question "What do
you think you learned from this experience?" All responded that they
came away with something valuable. Many stated that they now appreciated
the "complex and logical thought processes" that children often use when
doing an arithmetic problem incorrectly. "It makes me aware of problems
that children have and they sometimes think logically, not carelessly as
sometimes teachers think they do." "I never realized the many different
ways a child could devise his own system to do a problem." They also
stated that they learned better procedures for discovering the underlying
bug: "I learned that it is necessary to try many different types of
examples to be sure that a child really understands. Different types of
difficulties arise with different problems." Several stated their mixed
feelings about working with a computer. "Trying to beat the machine can be
challenging." "I learned that computers are a very complicated piece of
machinery. If one isn't experienced with the mechanism, then problems
could result." And finally, "The types of analyses necessary to 'debug'
student errors on the test (paper/pencil) seems more difficult than with
the computer. But that doesn't make any sense. The 'analysis' ought to be
the same. Perhaps the computer motivated my analytical ability."

Appendix 3 lists all written responses to the question "What is your
reaction to BUGGY?" Many felt that "BUGGY could be used to sharpen a
teacher's awareness of different difficulties with addition and
subtraction." They felt that it might be of use in grade school, high
school, or with special needs students, or even as a "great experience in
beginning to play with computers."

Conclusion and Extensions

Although our experience shows that student teachers learn a
significant amount from their use of BUGGY, the system should still be
substantially extended. In particular, most of what the students learned
while using BUGGY they learned or discovered, in some sense, on their own.
BUGGY does no explicit tutoring. It simply challenges their theories and
encourages them to articulate their thoughts. The rest of the learning
experience occurred either through the sociology of team learning or from
what a person abstracted on his own. The next step in improving the
educational effectiveness of BUGGY is to (1) implement an intelligent tutor
to critique the example test problems the students create, (2) point out
interesting facets of their debugging strategies and (3) isolate
manifested weaknesses in their strategies. Our experience indicates that
such a tutor would be very helpful in that it could keep students from
getting caught in unproductive ruts and could help focus their attention on
the structure of the procedures themselves.

As a historical footnote, BUGGY was originally developed to explore
the psychological validity of the procedural network model for complex
procedural skills. During that investigation we realized the pedagogical
potential of even this simple version of BUGGY as an instructional medium.
More recent versions of this system have stressed instructional aspects by
adding such features as assigning "costs" to student-generated test cases,
thereby encouraging the student to optimally formulate and test his
hypothesis.
Along these same lines, the "expert" portion of the procedural net
should be made "articulate" in the sense of being able to explain and
justify the subprocedures it uses. This would allow a student to pose a
problem to the system and obtain a running account of the relevant
procedures as the "expert" solves the problem.

Another area for extension concerns the psychological validity of the
skill decomposition (and buggy variants) in the procedural network.
Determining the proper functional breakdown of a skill into its subskills
is critical to the psychological validity of the model and the resulting
behavior of the system. If the breakdown of the skill is not correct, bugs
that people would consider simple may be difficult to model, while those
suggested by the model may be judged "unrealistic". From the network
designer's point of view this leads to the issue of choosing or
constructing one structural decomposition instead of another. We are just
beginning to acquire a large data base of arithmetic errors from Stanford
[Searle 1976] and will be testing to see how well our diagnostic model
accounts for all of them. In particular, we are concerned not only with
how many underlying bugs our current model captures, but also how many bugs
our network predicts that never show up. A more subtle issue concerns the
validity of the actual functional decomposition of the skills in the
network. Measuring the "correctness" of a particular network is a
problematic issue as there are no clear tests of validity, but issues such
as the ease or "naturalness" of inclusion of newly discovered bugs and the
appearance of combinations of bugs within a breakdown can be investigated.

We are also in need of a theory which explains what makes an
underlying bug easy or difficult to diagnose. Simple conjectures
concerning the depth of the bug from the surface don't seem to work, but
more sophisticated measures might. It's hard to see how to predict the
degree of difficulty in diagnosing a particular bug without a precise
information processing or cognitive theory of how people actually formulate
conjectures about the underlying bug or cause of an error.



Finally, we note that we have left open the entire issue of a semantic
or teleological theory of how bugs are generated in the first place. The
need for such a theory is important for at least two reasons. First, it
could provide an interesting theoretical mechanism that would account for
the entire collection of empirically arrived at bugs, and second, it
provides the next step in a semantically based productive theory of student
modelling.



CHAPTER 2

AUTOMATED PROTOCOL ANALYSIS - A TECHNIQUE FOR MODELLING AND MEASURING
STUDENT PERFORMANCE(15)

SECTION I

A persistent theme throughout our research has been that for
intelligent CAI programs to successfully tutor a student, they must be
able to induce a model of the student's current knowledge and
preferred interaction modes. Otherwise, computer-based tutors,
regardless of the power of their embedded expert, risk transactions with
the student that are inappropriate or annoying.

To address this student modelling problem, one must have some means
for making hypotheses regarding the student's knowledge. The
previous chapter described such a technique, namely diagnostic models
built around procedural networks. This chapter discusses another technique
that augments the previous one, and, unlike the previous one, assumes that
the main source of data available to the ICAI tutor is the student's
problem solving protocol or trace (as opposed to just his answer). This
chapter proposes a theory and a computational approach for automating the
protocol analysis task for the purpose of automatically inducing a
structural model of the student's problem solving strategies. It then
discusses the design of a computer system, named PAZATN, for carrying out
this task.

In addition to providing us with a powerful technique for discovering
a student's underlying reasoning strategies, automated protocol analysis
also offers us a new means of measuring and testing the tutor's [...].
With it [...] can determine whether successive protocols reflect [...]
problem solving competence on the part of the student. It can [...]
rigorous measures of the virtues of alternative tutor [...].
Finally, protocol analysis can also serve as a diagnostic [...]
discovering gaps in the knowledge of a practicing problem [...]
direct a computer based assistant's attention to those areas that require
assistance and review (e.g. an adaptive Job Performance Aid).

(15) A substantially modified version of this chapter is appearing as a
working paper by Goldstein and Miller.

In designing such an automated protocol analysis system, we have
drawn on concepts and algorithms from computational linguistics.
While the protocols we consider relate to problem solving behavior, and
not linguistic interactions, we nevertheless believe that there is a
fruitful synergy between the concepts developed in the language
understanding arena and the problems of ICAI.

Technical Statement of the Problem

Protocol analysis assigns one or more theoretical interpretations
to a record of a subject's overt behavior on a problem solving task.
Our concern is with problem solving tasks in which a student or subject
interacts with an on-line computer terminal. For such tasks, the
behavioral record is the sequence of keystrokes from the console
session. The keystrokes are grouped into events, which are treated as
unitary input/output transactions. An advantage over the most general
analysis situation is gained by assuming that the dialogue occurs
within the confines of a well-defined finite "menu" of legal responses.
Our primary concern is to account for problem solving behavior; we do not
attempt to solve the natural language understanding problem as a
subprocedure.

For the purposes of this discussion, an interpretation is a structural
description of the list of events, augmented by an assignment of values
to a set of semantic context variables, and a set of pragmatic assertions,
associated with each node of the description. The semantic
variables and pragmatic assertions relate the subgoal structure of the
problem solving protocol to the model, a formal description of the task
to be accomplished. In applications of automatic protocol analysis, it
is common to assume the existence of this formal problem
description. It is not assumed that the student has internally
represented the task in precisely the same fashion. These definitions
are elaborated in section two.
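
As a minimal sketch of the definition above, an interpretation can be
pictured as a tree over the event list in which every node carries semantic
variables and pragmatic assertions. The class and field names used here
(Node, semantics, pragmatics, fringe) and the toy events are illustrative
assumptions, not PAZATN's actual representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """One node of a structural description (hypothetical layout)."""
    label: str                                   # e.g. "SOLVE", "PLAN", or an event id
    children: List["Node"] = field(default_factory=list)
    semantics: Dict[str, object] = field(default_factory=dict)  # semantic context variables
    pragmatics: List[Tuple] = field(default_factory=list)       # pragmatic assertions

    def fringe(self) -> List[str]:
        """Return the terminal events dominated by this node, left to right."""
        if not self.children:
            return [self.label]
        events: List[str] = []
        for child in self.children:
            events.extend(child.fringe())
        return events


# A toy interpretation of a three-event protocol (all values are made up).
interpretation = Node(
    "SOLVE",
    children=[
        Node("PLAN", children=[Node("E01"), Node("E02")],
             pragmatics=[("REASON", "PLAN", "decomposition")]),
        Node("E03"),
    ],
    semantics={"MODEL": ["goal-predicate"]},
)

print(interpretation.fringe())
```

Under this reading, a structural description accounts for a protocol only
if its fringe reproduces the list of events in order.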



In order to impose realistic bounds on the specification of the
analyzer, it is also assumed that the protocol is "reasonable."
That is, the protocol should represent a sincere attempt to solve the
problem at hand, and should terminate exactly when this goal has been
accomplished. Although "reasonable" is difficult to define more
precisely, PAZATN's sensitivity to this assumption will be made clear in
the ensuing discussion.

Determining the Validity of Theoretical Interpretations 

The validity of the interpretations assigned by the analyzer may
be ascertained in a variety of ways. Our philosophy is to utilize every
available source of evidence. Since the synthetic problem solver
employs identical descriptions, its heuristic adequacy is taken as
suggestive, though by no means decisive, evidence. Introspection by
human problem solvers is another source of weak confirming evidence.

The analyzer's ability to predict future behavior on the basis of
past performance will provide the strongest corroboration. No
formal experimentation has been carried out to date. Our plan is to employ
the finished system for this type of rigorously controlled
experimentation. Ultimately we hope to embed such analyzers in
computerized tutors. This is an ambitious undertaking. When a
prototype is available, though, the pedagogical efficacy of that system
will provide a further check.

Review of the Synthetic Theory 

Before examining the analyzer in detail, it will be helpful to
briefly review the synthetic theory. The basis for the approach is
a hierarchical classification of commonly observed planning and
debugging techniques. According to the planning theory, when the
problem solver confronts a problem, there are three major categories of
plans which may be pursued. The problem may be solved by
identification, that is, by recognizing it as a problem for which a
solution already exists in some answer library. This type of plan
may seem a bit trivial, but of course it is absolutely essential to
avoid infinite regress.

Alternatively, the problem may be solved by
decomposition, that is, by subdividing it into smaller, easier
subproblems. These are each solved separately (by recursively calling
the problem solving system), and then recombined in one of several
specific ways, to produce a solution to the original problem.

If these strategies fail to produce a solution, the problem may
be solved by reformulation, that is, by redescribing the goal in other
terms which seem more amenable to solution. The reformulated problem
must, of course, still be solved itself (recursively calling the
problem solving system), by identification, decomposition, or further
reformulation.

Each of these categories of planning concepts is further subdivided
by the theory, as illustrated by Figure 1. Identifications may
be accomplished by retrieval from a lexicon of primitive operations for
the task domain, or by retrieval from an extensible answer library.
Decomposition may be performed by Conjunction or by Repetition (among
others). Reformulation may involve Equivalent models or Simplifications.
Each of these, in turn, is elaborated still further.
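
The three plan categories can be sketched as a small recursive procedure.
This is only an illustration under stated assumptions: the function solve,
its depth bound, and the toy string domain (problems joined by "+", an
answer library keyed by lower-case names) are hypothetical and are not
taken from PATN.

```python
def solve(problem, library, decompose, reformulate, depth=5):
    """Try identification, then decomposition, then reformulation."""
    if depth == 0:
        return None  # give up rather than regress forever
    # Identification: a solution already exists in the answer library.
    if problem in library:
        return library[problem]
    # Decomposition: split, solve each part recursively, then recombine.
    parts = decompose(problem)
    if parts is not None:
        subproblems, combine = parts
        solved = [solve(p, library, decompose, reformulate, depth - 1)
                  for p in subproblems]
        if all(s is not None for s in solved):
            return combine(solved)
    # Reformulation: redescribe the goal, then solve the new problem.
    restated = reformulate(problem)
    if restated is not None and restated != problem:
        return solve(restated, library, decompose, reformulate, depth - 1)
    return None


# Toy domain (hypothetical): problems are strings, "+" marks a conjunction.
library = {"a": "A", "b": "B"}


def decompose(problem):
    if "+" not in problem:
        return None
    return problem.split("+"), lambda parts: "+".join(parts)


def reformulate(problem):
    lowered = problem.lower()
    return lowered if lowered != problem else None


print(solve("A+b", library, decompose, reformulate))
```

In the toy run, "A+b" is decomposed, the part "A" is reformulated to "a"
before it can be identified, and "b" is identified directly.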

The taxonomy is transformed into a procedural problem solver in
the following manner. In order to represent semantic information, a
finite set of registers is defined. These are used for storing flags
and structures resulting from intermediate steps of the computation. At
this point, the taxonomy can be thought of as a highly non-deterministic
decision tree.

In order to increase the system's determinism, the nodes and links
of the tree are taken to be the states and arcs of a recursive transition
diagram. Arbitrary conditions over the contents of the registers
are associated with the arcs, as preconditions for following them.
Finally, arbitrary structure-building and register-setting actions are
associated with the arcs, to be performed whenever they are followed.
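
A minimal sketch of arcs that carry a precondition and a register-setting
action is given below. The names Arc and step, the state labels, and the
"known" register are assumptions made for illustration; the sketch also
omits the recursive (sub-network) calls that a full augmented transition
network would need.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Registers = Dict[str, object]


@dataclass
class Arc:
    source: str
    target: str
    condition: Callable[[Registers], bool]  # precondition over the registers
    action: Callable[[Registers], None]     # register-setting action


def step(state: str, registers: Registers, arcs: List[Arc]) -> Optional[str]:
    """Follow the first arc leaving `state` whose condition holds."""
    for arc in arcs:
        if arc.source == state and arc.condition(registers):
            arc.action(registers)
            return arc.target
    return None  # no arc applies, so this path is blocked


# Hypothetical two-arc fragment: identify if the problem is known,
# otherwise decompose.
arcs = [
    Arc("PLAN", "IDENTIFY", lambda r: bool(r.get("known")),
        lambda r: r.update(plan="identification")),
    Arc("PLAN", "DECOMPOSE", lambda r: not r.get("known"),
        lambda r: r.update(plan="decomposition")),
]

registers: Registers = {"known": False}
next_state = step("PLAN", registers, arcs)
print(next_state, registers["plan"])
```

The conditions make the choice between the two arcs deterministic for any
given register contents, which is the point of attaching them to the arcs.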



Figure 1. The Planning Taxonomy

For efficiency, some states with similar topology are merged, and
a few additional arcs are added to provide for such features as iterative
control, when recursively invoking the complete problem solver is
unnecessary. Although we allow arbitrary conditions and actions, these
are not chosen arbitrarily, but are carefully selected to reflect the
semantics and pragmatics of the problem solving process.

The result of this metamorphosis is PATN's synthetic augmented
transition network displayed in Figure 2. (16)

PATN has a particularly interesting property from the standpoint of
protocol analysis. It views certain types of errors (bugs) as rational,
in that they result from heuristically sound planning choices made in
the absence of complete information, and is capable of producing
partial solutions (i.e., traversing intermediate states) containing
bugs of this type.

Design Considerations 

A major insight of generative grammarians (e.g., Chomsky [1965])
was that in characterizing a set of phenomena, it is often helpful to
conceptualize the formalism synthetically, and to view analysis as a
process of inverting synthetic rules. Equivalently, analysis may be
described as the selection of one or more plausible derivations from
a potentially infinite collection of synthetic possibilities. In
designing PAZATN, we have found it enlightening to view protocol
analysis as parsing in this sense, where PATN is taken as the generative
formalism.

Since the space of synthetic possibilities (both in language
processing and in problem solving) is potentially infinite, it is
critical that this space be characterized using a finite (reasonably small)
set of rules. In PATN, these rules take the form of an ATN. This is
somewhat unusual, since in computational linguistics the ATN is
commonly thought of as an efficient mechanism for inverting
transformational rules, i.e., for analysis. PATN's synthetic ATN is a
generator for the space of plans and debugging techniques which are
relevant to the problem at hand.

(16) PATN is an expert problem solving system, designed by Miller and
Goldstein [1976], in which planning knowledge is modeled using augmented
transition networks [Woods 1970]. This system serves as the cornerstone of
a grammatical theory of problem solving which can act as a formalism for
representing the knowledge of our Articulate Expert for mathematics and
some aspects of electronics.

Figure 2. Planning ATN for Symbolic Integration

Naturally, PAZATN is not prepared to understand protocols which PATN
could not be made to generate eventually. The one exception to this is
that buggy versions of various synthetic plans (including "irrational"
bugs which would not be introduced by PATN) can often be recognized.
Since PATN is presumably an effective procedure within its domain of
competence, the analysis could, in principle, be performed by exhaustively
enumerating the set of synthetic protocols, and selecting the first one
which matches the input data. Unfortunately, this would take
considerable time. Consequently, the primary consideration in the
analyzer's design must be to ensure that this synthetic plan space is
searched efficiently. Bottom up evidence from the actual protocol is
used for this purpose.

An important design consideration is that the analyzer be able to
take full advantage of the available sources of constraint. The
protocol analyzer has access to an unusually strong set of
expectations, namely the model. This is analogous to knowing the
"gist" of what a speaker is going to say before parsing it.
Consequently, the analyzer must be organized in such a manner that it is
able to extensively utilize the top down synthetic guidance which can be
provided by PATN.

This might suggest a design based on using PATN as a purely top
down predictive analyzer. The difficulty is that, while we know the
"gist" of the input, there is a tremendous diversity of potential
realizations of a given model in terms of the form of the solution. So it
is more like knowing the "theme" of a story, but not knowing whether the
author will present the events in chronological order, via flashbacks,
or in an order derived from some other organizing principle. The
unguided PATN could generate scores of irrelevant synthetic solutions
before stumbling upon one that matched the data. This factor leads to a
somewhat elaborate dual organization for the analyzer, which enables
it to reduce the diversity by considering bottom up evidence as well.

Another difficulty which must be faced, if PAZATN style analyzers
are to be viable for eventual dynamic use in computerized
tutoring, is that events must be examined in a single pass, in
approximately left to right order. One could postpone this issue
temporarily, but such a simplification might result in a design which
could not be extended for applications because of fundamental, premature
commitments. If the analyzer is forced to back up frequently, over many
events, it is often likely to find itself "apologizing" for
inappropriate tutorial remarks regarding prior events. Consequently it
must carry along any plausible alternative interpretations in parallel,
until it has a clear basis for ruling them out. Conversely, the analyzer
must have some capability for restricting the set of alternatives under
active consideration, to ensure that excessive processing and storage
resources are not consumed by low plausibility interpretations.

The organization of the protocol analyzer is a
generalization and elaboration of the coroutine search plan-finding
procedure used by Mycroft [Goldstein 1974, 1975]. The differences
arise mainly from the need to take account of the considerations
mentioned above. In particular, the protocol analyzer is intended
to: (a) apply to more than a single task domain; (b) understand a
wider range of event types (e.g., Mycroft was designed to analyze
finished computer programs rather than protocols); (c) reap maximum
advantage from the dynamic information available in the protocol regarding
subgoal structure and development; and (d) embody the more coherent
structured planning and debugging theory underlying PATN.



Overview

The PAZATN protocol analyzer is constructed on PATN's synthetic
foundations by supplementing the synthetic ATN with a number of
additional modules and data structures. One data structure is used to
keep track of the set of plausible subgoals which have been proposed by
PATN. Another is used to record the state of partially completed
interpretations of the protocol. A preprocessor module is used to
suppress uninteresting syntactic details and to perform preliminary
segmentation. The preprocessor employs an event classifier to
determine the syntactic class of each event of the protocol.
Corresponding to each syntactic category, PAZATN must be supplied with
an event specialist which embodies the requisite domain knowledge for
assisting an event interpreter in associating an event of that type with
some synthetic subgoal. Since a purely top down or bottom up strategy
would be inefficient, a scheduler module is necessary to direct the
analyzer through a "best first" coroutine search.
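
The "best first" idea, together with the earlier requirement that low
plausibility alternatives be dropped, can be illustrated with a priority
queue. This sketch is not the PAZATN scheduler: the function best_first,
the max_agenda bound, the candidate labels and the plausibility numbers
are all invented for illustration, and the coroutine interplay between
top down proposals and bottom up evidence is left out.

```python
import heapq


def best_first(events, candidates, max_agenda=10):
    """Return the most plausible assignment of one subgoal label per event.

    `candidates` maps each event to (label, plausibility) pairs with
    plausibility in (0, 1]; both are hypothetical inputs.
    """
    # Agenda entries are (negated plausibility, labels chosen so far), so
    # the heap always yields the most plausible partial interpretation.
    agenda = [(-1.0, ())]
    while agenda:
        neg_score, labels = heapq.heappop(agenda)
        if len(labels) == len(events):
            return list(labels), -neg_score
        event = events[len(labels)]
        for label, plausibility in candidates[event]:
            heapq.heappush(agenda,
                           (neg_score * plausibility, labels + (label,)))
        # Restrict the set of alternatives under active consideration.
        agenda = heapq.nsmallest(max_agenda, agenda)
        heapq.heapify(agenda)
    return None


events = ["E05", "E06"]
candidates = {
    "E05": [("simplification", 0.7), ("restatement", 0.3)],
    "E06": [("substitution", 0.6), ("buggy-substitution", 0.4)],
}
labels, score = best_first(events, candidates)
print(labels)
```

Several partial interpretations stay on the agenda in parallel, and the
bound on its size plays the role of ruling out the least plausible ones.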

Section two elaborates our notion of protocol analysis as a parsing
process analogous to the natural language processing task. The third
section provides a slightly simplified description of the
organization of the automatic protocol analyzer. Section four refines
this first order description of PAZATN's design. Finally, we present
our tentative conclusions and plans for future work.
SECTION II

A GRAMMATICAL APPROACH TO PROTOCOL ANALYSIS

This section addresses the question: "What is it about PAZATN's
approach to protocol analysis that makes it grammatical?"
Central to the approach is the conjecture that various aspects of
problem solving behavior can be studied approximately independently.
Consider the underlying problem solver (i.e., the subject) whose
behavior is to be analyzed. While we conceive of this problem solver
as being an integrated procedural system, we nevertheless suppose, at
least as a research strategy, that certain aspects may be factored out
for separate study: the structural component, the semantic
component, and the pragmatic component. These correspond,
respectively, to the potential control paths, data flow, and branching
conditions of a procedural problem solver. These aspects are
modelled by the network of states and arcs, the registers, and the
transition conditions of the augmented transition network. The
next sub-section introduces an example protocol in order to illustrate
PAZATN's analysis.



An Example Problem Solving Protocol 

In this sub-section we provide a brief example of the type of
problem solving protocol which PAZATN is to analyze, and the sort of
analysis which it would provide. Imagine a situation in which a student
(S) is interacting with a computerized educational environment such
as SOPHIE. Suppose S is confronted with the following problem:

     In an electrical circuit, the voltage at time "t" is given
     by

          e(t) = r sin(wt)

     where r and w are arbitrary constants. Find the root-
     mean-square voltage for the time interval [a,b].

A segment from a hypothetical protocol, representing S's solution path on
this problem, is shown in Figure 3. Before delving into the details of
PAZATN's analysis, we provide an informal account of the student's
solution.

The student was familiar with the definition of root-mean-square
voltage, and hence began the protocol by writing down the relevant formula.




E01:   $V_{rms} = \sqrt{\frac{1}{b-a}\int_a^b [e(t)]^2\,dt}$

Figure 3. The Example Protocol Segment

     E01:   $V_{rms} = \sqrt{\frac{1}{b-a}\int_a^b [e(t)]^2\,dt}$
     E02:   $= \sqrt{\frac{1}{b-a}\int_a^b [r^2\sin^2(wt)]\,dt}$
     E03:   $\int r^2\sin^2(wt)\,dt$
     E04:   $r^2\int \sin^2(wt)\,dt$
     E05:   $\int \sin^2(t)\,dt$
     E06:   $\int u^2\,du$
     E07:   $\frac{u^3}{3}$
     E08:   $\frac{\sin^3(t)}{3}$
     E09:   $\sin^2(t)\cos(t)$
     E10:   $\int \frac{u^2\,du}{\cos(t)}$
     E11:   $\cos(t) = \sqrt{1-\sin^2(t)}$
     E12:   $\int u^2\,(1-u^2)^{-1/2}\,du$
     E13:   $\int \sin^2(t)\,dt$
     E14:   let $u = \sin(t)$, $dv = \sin(t)\,dt$
     E15:   $du = \cos(t)\,dt$, $v = -\cos(t)$
     E16:   $\int \sin^2(t)\,dt = -\sin(t)\cos(t) + \int \cos^2(t)\,dt$
     E17:   $\int \cos^2(t)\,dt = \int 1\,dt - \int \sin^2(t)\,dt$
     E18:   $2\int \sin^2(t)\,dt = t - \sin(t)\cos(t)$

Next, S substituted the particular definition for e(t) provided by the
current problem statement.

E02:   $= \sqrt{\frac{1}{b-a}\int_a^b [r^2\sin^2(wt)]\,dt}$



This resulted in a problem whose essence is integrating the function
$\sin^2$. Some students might have remembered the formula for this indefinite
integral, in which case the solution would have been straightforward.
In this case, S knew only a few simple integrals and a few basic rules
for decomposing complex integrals into simpler ones. In the next step S
focused on this integration task.

E03:   $= \int r^2\sin^2(wt)\,dt$

Then S applied the "sum of integrands" rule, eliminating the $r^2$ term.

E04:   $= r^2\int \sin^2(wt)\,dt$

As a simplification, S decided to ignore the w term in the argument
to the sin function.

E05:   $= \int \sin^2(t)\,dt$

At this point, S attempted to apply the substitution u = sin(t), hoping to
convert the integrand to a polynomial, one of the primitive integrals
which was known. However, the student committed the common error of
failing to substitute for the differential term.

E06:   $= \int u^2\,du$

In a sense, the bug was fortuitous, since it converted the integrand
to a simple polynomial.

E07:   $= \frac{u^3}{3}$

The final step of S's substitution plan was to re-substitute for the
temporary variables, restoring the solution to include only those terms
which were mentioned in the original problem statement.

E08:   $= \frac{\sin^3(t)}{3}$

At this point, S became suspicious of the substitution: the result
seemed too simple. As a check on its validity, S differentiated
the expression.

E09:   $\sin^2(t)\cos(t)$
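
The check that S performs at E09 can be reproduced numerically. The short
sketch below is an illustration only (it is not part of the protocol or of
PAZATN); it uses a central-difference estimate, with the step size h and
the test point t = 1.0 chosen arbitrarily, to compare the derivative of the
E08 result with the E05 integrand.

```python
import math


def derivative(f, t, h=1e-6):
    """Central-difference estimate of df/dt at t."""
    return (f(t + h) - f(t - h)) / (2 * h)


def candidate(t):
    """E08: the proposed antiderivative sin^3(t)/3."""
    return math.sin(t) ** 3 / 3


def integrand(t):
    """E05: the function S set out to integrate, sin^2(t)."""
    return math.sin(t) ** 2


t = 1.0
slope = derivative(candidate, t)

# The derivative does not reproduce the integrand, so E08 is not a solution.
print(abs(slope - integrand(t)) < 1e-6)
# It matches sin^2(t)cos(t) instead, which is the expression written at E09.
print(abs(slope - math.sin(t) ** 2 * math.cos(t)) < 1e-6)
```

The extra factor of cos(t) is exactly the differential term that was
omitted at E06.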




Here, S realized the mistake in E06, and re-executed the
substitution. This time S correctly substituted for the
differential term, except that the expression used was still in terms of
t, not u.

E10:   $\int \frac{u^2\,du}{\cos(t)}$

The appropriate next step is to rid the expression of t. S
accomplished this using the Pythagorean relation.

E11:   $\cos(t) = \sqrt{1-\sin^2(t)}$

E12:   $\int u^2\,(1-u^2)^{-1/2}\,du$

Actually, at E12, S has derived the canonical u = sin(t)
substitution formula. However, the resulting subproblem was also
unfamiliar. It did not appear to S to be sufficiently simpler than the
original problem.

The substitution plan therefore failed to produce the desired
result. Hence, S retreated to the $\sin^2(t)$ formulation, and tried a new
approach: integration by parts.

E13:   $\int \sin^2(t)\,dt$

E14:   let $u = \sin(t)$, $dv = \sin(t)\,dt$

E15:   $du = \cos(t)\,dt$, $v = -\cos(t)$

E16:   $\int \sin^2(t)\,dt = -\sin(t)\cos(t) + \int \cos^2(t)\,dt$

Integration by parts resulted in what appears, at first, to be an equally
hard problem: integrating $\cos^2(t)$.

E17:   $\int \cos^2(t)\,dt = \int 1\,dt - \int \sin^2(t)\,dt$

But once again, the student applied the Pythagorean relation, this
time leading to an equation which did allow solving for the desired
integral.

E18:   $2\int \sin^2(t)\,dt = t - \sin(t)\cos(t)$

Event 18 still does not represent a complete solution to the original
problem. S might still have forgotten, for example, to correct for the
simplification introduced at event E05, or might have incorrectly
evaluated the limit terms for the definite form of the integral.
However, this segment of the protocol is sufficient to serve as our
example of the form of PAZATN's analysis.
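
For reference, the integration by parts branch of the protocol can be
written out in full. This is standard calculus rather than part of the
protocol segment; the last two lines, which solve for the integral and
restore the w term dropped at E05, are supplied here only as an
illustration of what a complete solution would still require.

```latex
\begin{align*}
\int \sin^2(t)\,dt &= -\sin(t)\cos(t) + \int \cos^2(t)\,dt
    && \text{(parts, with } u = \sin(t),\ dv = \sin(t)\,dt\text{)} \\
\int \cos^2(t)\,dt &= \int 1\,dt - \int \sin^2(t)\,dt
    && \text{(Pythagorean relation)} \\
2\int \sin^2(t)\,dt &= t - \sin(t)\cos(t)
    && \text{(combining the two lines above)} \\
\int \sin^2(t)\,dt &= \tfrac{1}{2}\bigl(t - \sin(t)\cos(t)\bigr) + C \\
\int \sin^2(wt)\,dt &= \frac{1}{2w}\bigl(wt - \sin(wt)\cos(wt)\bigr) + C
    && \text{(restoring } w \text{ via } s = wt\text{)}
\end{align*}
```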

Structural Descriptions

The result of PAZATN's protocol analysis is a set of data structures
representing these several aspects of the problem solving behavior.
The first is a description of the subgoal structure of the protocol.
This data structure is similar to the context free deep structures (or
base components) of natural language parsing. It summarizes the arc
transitions which presumably were followed by the generating ATN. The set
of legal structural descriptions may be characterized by a context free
grammar. To apply PAZATN to a wide range of protocols, a thorough
analysis of the specialized problem-decomposition techniques relevant
to the particular domain is necessary. The reduced grammar illustrated
in Figure 4 is adequate for analyzing the subgoal structure of the
segment of protocol introduced above. While this grammar is typical of
the sort we envision, by no means does it represent a complete task
analysis.

Figure 4. The Context Free Grammar

Figure 5 indicates the structural description of this protocol
which PAZATN is intended to produce. Such structural descriptions
capture one aspect of problem solving behavior. They can be used to
provide formal answers to certain questions which heretofore might have
been discussed only in a more intuitive way. As an example, the
parse tree makes it apparent, by inspection, that the student is
comfortable with integration by parts; however, the incorrect first
attempt to use substitution, and the subsequent failure to apply it on a
second, appropriate occasion (at E12), provide evidence that this
student requires additional practice using substitutions.

Semantics and Pragmatics

Although the sort of description discussed in the previous
section is useful for answering certain questions, it does not tell
the whole story. Even to make such structural descriptions intelligible
to the reader, it is necessary to provide some semantic and
pragmatic commentary. The synthetic theory of planning and debugging
provides the basis for more complete and precise semantic and pragmatic
annotation.

Semantic annotation is defined to be the values of the ATN
registers associated with each node of the structural description.
These relate the behavior to the formal problem description. Pragmatic
annotation is defined to be a record of the justifications for selecting
a given arc transition rather than its competitors. In analysis, this
pragmatic annotation is a hypothesis about the subject's reasons for
using a particular approach. These hypotheses are based on both
PATN's arc conditions (when the recommended synthetic transitions have
been made) and heuristic inferences from the available data.

Figure 5. Structural Description of the Example Protocol

The following is a typical set of registers which would be employed
by PATN to define the semantic context of a node in the problem solving
tree. Some of these are not primitive, since they are derivable from
one or more of the others. It is possible that additional
semantic variables may be added in future research, perhaps in tailoring
PATN to particular domains. The list below is adequate for our current
purposes.

     1. ?TREE is that part of the parse tree attached
     to the current node ("below" it).

     2. ?PROCEDURE is the terminal solution procedure
     as defined so far. This reflects the state of the plan
     after any debugging events have been taken into account.

     3. ?EFFECT is a domain-oriented description of
     the actual performance obtainable by the solution as
     defined so far. Since a partially solved problem may
     contain references to currently unsolved subgoals,
     ?EFFECT may be unassigned at a given node.

     4. ?PROTOCOL is the "fringe" of ?TREE. That is,
     it is the list of terminal events dominated by a given
     node.

     5. ?PLAN is a collapsed version of the subtree
     associated with ?PROCEDURE. That is, ?PLAN corresponds
     to the notion of the plan of a finished solution. The
     concept of collapsing a parsed protocol into a plan is
     elaborated in other reports by the authors.

     6. ?MODEL is the set of predicates which
     ?PROCEDURE is intended to accomplish. For a correct
     solution ?EFFECT will be a special case of ?MODEL.

     7. ?ADVICE is a list of planning and debugging
     suggestions generated by the synthetic pragmatics of
     PATN. For example, in solving a given integral by
     partial fractions, when it is not known for certain
     whether such a decomposition is valid, a record of the
     fact that the partial fractions arc transition may have
     been inappropriate is appended to the current contents of
     ?ADVICE. This helps to guide the debugging component in
     diagnosing the underlying cause of later model
     violations.

     8. ?TITLE is the symbolic name of the solution
     currently being developed. This aids in the detection of
     self-referential (recursive) plans. An example of its use
     in the example protocol is when the integration by
     parts led to a second occurrence of the integral of $\sin^2$.
     Sometimes, as it happened here, a self-reference results
     in a solution; at other times, it may indicate a
     circularity in the solution path.

     9. ?GIVENS is a list of the names and types of
     the given data, and assumptions which may be made
     regarding them by the subplan below a given node. This
     is used, for instance, in the detection of
     inconsistencies between the definitions of subgoals and
     their usage.

     10. ?VIOLATIONS is the list of model predicates
     which are not satisfied by the ?EFFECT achieved by
     ?PROCEDURE. This register is set by a separate
     performance annotation module.
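
The relationship among ?MODEL, ?EFFECT and ?VIOLATIONS can be sketched as
follows. Everything in the example is an assumption made for illustration:
the function violations, the representation of the model as named
predicates over a dictionary, and the string encodings of the effects are
hypothetical and do not describe the performance annotation module itself.

```python
from typing import Callable, Dict, List

Effect = Dict[str, str]
Model = Dict[str, Callable[[Effect], bool]]


def violations(model: Model, effect: Effect) -> List[str]:
    """Names of the model predicates not satisfied by the achieved effect."""
    return [name for name, holds in model.items() if not holds(effect)]


# Hypothetical model for the integration subgoal.
model: Model = {
    "derivative-equals-integrand": lambda e: e["derivative"] == e["integrand"],
    "no-temporary-variables": lambda e: "u" not in e["result"],
}

# Effect of the buggy substitution plan (E06 to E08), encoded as strings.
buggy_effect: Effect = {
    "result": "sin^3(t)/3",
    "derivative": "sin^2(t)cos(t)",
    "integrand": "sin^2(t)",
}

# Effect once the integration task has been solved and debugged.
repaired_effect: Effect = {
    "result": "(t - sin(t)cos(t))/2",
    "derivative": "sin^2(t)",
    "integrand": "sin^2(t)",
}

print(violations(model, buggy_effect))
print(violations(model, repaired_effect))
```

An empty list corresponds to the case in which the effect is a special case
of the model.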

Let us briefly consider a few examples of the values of these
registers at various nodes of the structural descriptions for the
hypothetical problem solving protocol presented earlier. For the SOLVE
node corresponding to E03, ?MODEL is as shown in Figure 6.

Prior to E09, the ?VIOLATIONS register at the PLAN node for the
substitution was:

     (NOT (= (EXPR E05) (EXPR E06)))

Since the integration task is eventually solved, ?VIOLATIONS is empty at
its SOLVE node, since solutions include debugging. The same is not true
for the corresponding PLAN node.

Figure 6. Problem Description (Model) for Top Level Integral

     $\exists\, f(t)$ such that

          $\frac{d f(t)}{dt} = r^2 \sin^2(wt)$;

     and [...]

The pragmatics provides rationales for the various planning
choices in the protocols. These are derived from the synthetic arc
conditions when applicable. For example, the reason for integration by
parts being attempted on the integration task was that the integrand was in
the form of a product of two terms.

     (REASON (INT-BY-PARTS E13)
             (EQ (FORM (INTEGRAND E05)) 'PRODUCT))

The reason for each buggy event in the protocol is the same as the
reason for what might have been the corresponding correct version of the
event, but flagged by a note stating that the attempt was buggy.

Debugging operations localize (or repair) the cause of some
violation. The reason for E09, for example, is to verify that the
integration satisfied its specifications (i.e., that the derivative of
the result gives the original expression). In this case, the underlying
cause of the violation was the omission of an essential cleanup step
(the differential term). The repair was to solve for the missing term, and
incorporate it at the appropriate point in the solution:

     (REASON E10 (REPAIR E06))

REASONs are represented by assertions involving instantiated arc
predicates of this sort, attached to each node of the structural
description.

Discussion 

The example protocol discussed in this section illustrates
the analyses which PAZATN is designed to generate. In keeping with
the grammatical metaphor, these analyses have three aspects: structural
(syntax), semantic (purposes), and pragmatic (reasons). The structural
analysis is represented as a parse tree. The semantic and
pragmatic information is represented as annotation (variables and
assertions) associated with each node of the parse tree.

Some readers might object that these three aspects alone do not
constitute a complete analysis of a protocol. Perhaps some essential
dimension of the subject's problem solving performance has been
overlooked. If there are useful questions about the behavior which are
not captured by these aspects, we would have to agree. However, our
working hypothesis is that there are not. Hence, we believe that part
of our contribution in this research is our recognition of the
appropriateness of a linguistic analogy.

A precise definition of protocol analysis has been provided,
along with a brief example of the form of this analysis. We now turn our
attention to the design of PAZATN, a scheme for performing such analyses
automatically.

SECTION III

ORGANIZATION OF THE PAZATN PROTOCOL ANALYZER

General

In this sub-section we describe the general organization of the
protocol analyzer. Later sub-sections present additional detail. The
analyzer would consist of the following data structures and modules:
PATN, the PLANCHART, the DATACHART, the preprocessor, the event classifier,
the (domain specific) event specialists, the event interpreter and the
scheduler. Figure 7 provides a block diagram. After reviewing the
analyzer's input/output specifications, we consider each of these
components in turn. Section four refines the first order
description provided in the current section. Since the event specialists
are domain specific, we will not provide details in this report.

The analyzer receives as input the model, which is a formal
statement of the top level goal, and the protocol, which is a list of
input/output events. It has been assumed that the protocol is
"reasonable," in that it represents a sincere attempt to accomplish the
task, and that it terminates exactly when this goal has been satisfied.
The design is robust in this respect: it relies only slightly on
these simplifying assumptions. Consequently, it is our expectation that
the analyzer will also prove to be useful (although it may perform less
efficiently) for less than ideal protocols, such as where the
subject/student makes a sensible start but fails to complete the project.

The output of the analyzer is a set of one or more plausible
interpretations of the protocol, where an interpretation is
defined as the assignment of a structural description (or "parse") to
the list of events, augmented by an assignment of values to the set of
semantic variables, as well as by a collection of pragmatic reason
assertions, for each node of the description. In order to discuss
the representation of interpretations, and the manner in which they are
discovered, it is necessary to introduce the roles of the ATN and PLANCHART
in the analysis process.

Augmented Transition Network (ATN) 

To understand the central role of the ATN, one need only remember
that the analyzer is little more than a procedure for selecting those
synthetic solutions to the stated problem which most closely match the
input data. However, the space of possible solution paths is
represented intensionally (as opposed to extensionally) by the ATN. We
require the ATN to generate complete protocols, even to the level of
events corresponding to the typing in of detailed instructions to the
computer monitor. Some of these requirements are superfluous for the
expert version of the problem solving system. Hence, we plan to
employ a slightly modified version of PATN in the analyzer (but the
differences are not otherwise important).

There is a question as to whether the expert version of the ATN will
eventually succeed in spanning the entire space of reasonable non-expert
behaviors, provided that each of its preferred approaches is
successively rejected by the analyzer. The expert version of PATN would
have the interesting property of being capable of producing partial
solutions which contain certain "rational bugs." Furthermore, it will
be seen that the spanning requirement does not rule out the
analysis of "inexplicable" (or "irrational") bugs, such as
typographical errors or memory lapses, provided that they can be
recognized as deviant versions of some rational synthetic
behavior. Consequently, we tentatively assume that PATN is indeed such
a spanning model in this extended sense.

The ATN would perform arc transitions partially as a result of
PATN's synthetic pragmatics and partially as a result of analytic
guidance. For example, the ATN may expand the plan for a subgoal which
might not have been pursued in the pure synthetic system, because
analytic criteria have established that this is probably a subgoal of the
subject/student. PATN then suggests how one might go about solving it.

The PLANCHART

As the analysis progresses, there are a number of reasons for needing
an extensional representation of the ATN process, as it operates upon the
particular problem. Consequently, a complete trace of the synthetic
computation is kept, for examination by the analyzer. This data structure
is called the PLANCHART. The most obvious reason for creating such a
representation is to avoid repeated calculations; but other important
uses of the PLANCHART will appear in the course of the discussion.

In fact, the PLANCHART includes not only plans, but nodes of other
types such as debugging episodes. As its name suggests, the PLANCHART
is a chart [Kay 1973], a network-like data structure which
compactly represents many combinations of subexpressions. This
data structure is an efficient representation for PATN's current
set of partial solutions and their structural descriptions. Rather than
generating the entire solution space at once, which would be impractical
even if the space happened to be finite, the ATN expands this PLANCHART
incrementally as additional possibilities are needed by the analyzer.

The PLANCHART resembles an AND/OR goal tree (see Figure 8
for an example). However, there are a greater variety of node types,
rather than just AND and OR. This allows the PLANCHART to represent
such concepts as whether a set of conjuncts need to be
accomplished in the specified order, or whether any order will do,
allowing a greater variety of synthetic combinations to be
expressed parsimoniously. For concreteness, we take the PLANCHART to be
a LISP S-expression. However, each subexpression is unique-ized; that
is, EQUAL subgoals refer to physically identical structures. The reasons
for this are explained shortly.

Figure 8. Example Planchart: Like an AND/OR Goal Tree
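
The unique-izing of subexpressions described above can be illustrated
with a small sketch. It is written in Python rather than LISP, and the
names Node, intern_node and _TABLE, as well as the node labels, are
hypothetical; nothing here is taken from an actual PAZATN implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Node:
    """One PLANCHART node: a node type, a label, and its child nodes."""
    kind: str                       # e.g. "SEQ", "CHOOSE", "LEAF" (illustrative)
    label: str
    children: Tuple["Node", ...] = ()


# Hypothetical intern table: structural key -> the single shared Node.
_TABLE: Dict[Tuple[str, str, Tuple[int, ...]], Node] = {}


def intern_node(kind: str, label: str, children: Tuple[Node, ...] = ()) -> Node:
    """Return the unique Node for this structure, creating it only once.

    Children are assumed to be interned already, so their identities
    are enough to build the lookup key.
    """
    key = (kind, label, tuple(id(c) for c in children))
    node = _TABLE.get(key)
    if node is None:
        node = Node(kind, label, children)
        _TABLE[key] = node
    return node


# Two plans that need the same subgoal end up sharing one structure.
leaf_a = intern_node("LEAF", "draw-side")
leaf_b = intern_node("LEAF", "draw-side")
plan_1 = intern_node("SEQ", "draw-square", (leaf_a,))
plan_2 = intern_node("SEQ", "draw-square", (leaf_b,))
```

Under these assumptions, leaf_a and leaf_b are the same object, so an
event assigned to that leaf is automatically visible from every plan
that contains it.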

The analysis process is closely tied to modifications of this data
structure. In particular, the structural description assigned to a
protocol corresponds to a subtree of the PLANCHART starting from the root
(the top level SOLVE node) to the individual protocol events
corresponding to a subset of the leaves. Consequently the structure
building actions of the analysis system are performed entirely by the
ATN.

The Representation of Interpretations

In view of the above remarks, it should be clear that an
interpretation of an event can be defined simply as an assignment of that
event to a leaf of the PLANCHART (Figure 9). Similarly, an
interpretation of the protocol corresponds to a complete association list
of such event assignments, and a partial interpretation is an association
list containing assignments for a subset of the events in the
complete protocol. As a consequence of the left-to-right processing
order, a typical partial interpretation contains assignments for the
first M out of N events.

Notice, though, that a given PLANCHART leaf may be a member of
more than one structural description, due to the structure sharing
mentioned earlier. This is an advantage. Genuine ambiguities need
not be treated as explicit alternatives. The analyzer does not commit
itself to an arbitrary decision. All possibilities are carried along,
implicitly, at no extra cost. It is possible, but unlikely, that
the complete association list for the entire protocol will likewise
have multiple structural description pathways through the PLANCHART.
Each of these, technically, should be considered a different
interpretation. Nevertheless, it is sensible to lump them together,
since this situation can only occur when the data have been inadequate to
distinguish them.

Figure 9. Interpreting Events by Assignment to PLANCHART Leaves

In order to be assigned to a given leaf of the PLANCHART, it is not
necessary for the data event to identically match the corresponding
synthetic event. The assignment merely reflects the heuristic
judgment of the analyzer that the actual data event was intended to
serve the same role as the associated synthetic event. Consequently
a synthetic event (i.e., a single PLANCHART leaf) actually stands for an
equivalence class of data events, with various plausibilities.

For an interpretation to be plausible, the data event must be very
"similar" to the assigned synthetic event. There are exactly two ways
in which the events may differ: (a) the data event is an alternative,
equivalent realization of the synthetic event; or (b) the data event
is a "buggy" realization of the synthetic event. The plausibility of
assignments of type (b) depends on three factors. One factor is the
intrinsic, essentially syntactic, similarity. Misspellings which differ
by only one or two characters are an example. The second factor is
knowledge of common bug types. Since "rational" bugs would appear as
distinct leaves of the PLANCHART, here we speak of the "irrational"
variety. Since there is, at present, no compelling theory to
account for such bugs, the evidence must be of a statistical nature, and
not necessarily the same for each individual. The third factor is the
context in which the bug occurs. This is determined by the status of
neighboring leaves. We return to these questions later.
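
The first factor, intrinsic syntactic similarity, can be illustrated
with a short Python sketch. The report does not say which similarity
measure would be used, so the standard library's SequenceMatcher ratio
stands in here as an assumption; the function name and the LOGO-like
event strings are likewise hypothetical.

```python
from difflib import SequenceMatcher


def syntactic_plausibility(data_event: str, synthetic_event: str) -> float:
    """Score in [0, 1]; 1.0 means the two event strings are identical.

    SequenceMatcher.ratio() is only a stand-in measure (an assumption),
    not the measure the analyzer is specified to use.
    """
    return SequenceMatcher(None, data_event, synthetic_event).ratio()


# Illustrative events: an exact match, a one-transposition misspelling,
# and an unrelated command.
exact = syntactic_plausibility("FORWARD 100", "FORWARD 100")
typo = syntactic_plausibility("FORWRAD 100", "FORWARD 100")
unrelated = syntactic_plausibility("PENUP", "FORWARD 100")
```

A score of this kind would cover only the first of the three factors;
knowledge of common bug types and the status of neighboring leaves
would have to be combined with it separately.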

The DATACHART

A partial interpretation is said to split when it proposes
more than a single PLANCHART assignment for its next event. Some
method for keeping track of the analyzer's alternative partial
interpretations is needed. Ideally, it should take advantage of the
fact that, following a split, the event interpretations prior to that
split remain the same: the common ancestry should be preserved. The
DATACHART serves this function.

The DATACHART may be thought of as a context-layered data base, such
as that provided by CONNIVER [Sussman & McDermott 1972]. PAZATN would
record partial interpretations in CONNIVER-like contexts. Suppose that
two interpretations have identical assignments for the first M events,
and then split. The split corresponds to a single context layer
having two descendants. Assertions corresponding to the shared part of
the interpretation are automatically inherited from the parent context
layer (Figure 10).

Whenever an event assignment is to be made whose plausibility
does not exceed some threshold, the following actions are performed:

(1) An assertion is added to the current context,
indicating which assignment is about to be made. This
ensures that the same possibilities will not be
repeatedly pursued.

(2) A PUSHCONTEXT is executed, creating a new
subcontext which will inherit prior assignments from the
parent context. This ensures that changes which reflect
the uncertain continuation of the interpretation will not
affect the state information in the parent.

(3) The uncertain assignment is performed in the
new subcontext. The normal operations associated with
event interpretation (described below) are carried out.

(4) A handle to this context is placed on a list
of NEW partial interpretations. This ensures that it
will be scheduled for at least one cycle of further
investigation.

(5) A POPCONTEXT is executed. The parent context
of the new interpretation is then re-examined to
determine if alternative assignments should also be
considered. If so, the above sequence of operations is
carried out for each. When no further alternatives seem
worth considering at the present time, the parent context
is placed on a list of HUNG interpretations.

Figure 10. Inheritance of Shared Partial Interpretations
[Diagram: a tree of context layers. Each layer holds assertions of the
form (ASSIGNMENT E01 ptr), covering events E01 through E07, with
alternative assignments for the same event placed in sibling layers.
One layer is annotated "state saved, but no actual splitting here."]

With this technique, it is not necessary to explicitly list all
of the possible alternative interpretations for a given event. Note
that, after the PUSHCONTEXT, the HUNG layer represents not a single
partial interpretation, but an indefinite number of implicit alternatives
to the partial interpretations explicitly represented by its
offspring. Even after it is HUNG, the parent context contains the
necessary state information for generating additional possibilities,
should it ever need to be reactivated.
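
The context layering and the splitting sequence can be sketched in
Python as follows. This is a minimal illustration under stated
assumptions, not CONNIVER's actual interface: the class Context, its
push and lookup methods, the function split, and the event and leaf
identifiers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class Context:
    """One DATACHART layer; assignments not found here are inherited."""

    def __init__(self, parent: Optional["Context"] = None) -> None:
        self.parent = parent
        self.assignments: Dict[str, str] = {}    # event id -> PLANCHART leaf id
        self.tried: List[Tuple[str, str]] = []   # alternatives already pursued

    def push(self) -> "Context":
        """Analogue of PUSHCONTEXT: create a child layer."""
        return Context(parent=self)

    def lookup(self, event: str) -> Optional[str]:
        """Find an assignment in this layer or in the nearest ancestor."""
        layer: Optional[Context] = self
        while layer is not None:
            if event in layer.assignments:
                return layer.assignments[event]
            layer = layer.parent
        return None


def split(parent: Context, event: str, candidate_leaves: List[str],
          new_list: List[Context], hung_list: List[Context]) -> None:
    """Apply steps (1) to (5) for each uncertain candidate assignment."""
    for leaf in candidate_leaves:
        if (event, leaf) in parent.tried:
            continue                             # never pursue the same one twice
        parent.tried.append((event, leaf))       # (1) record the attempt
        child = parent.push()                    # (2) new subcontext
        child.assignments[event] = leaf          # (3) assign inside the subcontext
        new_list.append(child)                   # (4) schedule as NEW
        # (5) control is back in the parent; loop to the next alternative
    hung_list.append(parent)                     # no further alternatives for now


root = Context()
root.assignments["E01"] = "leaf-1"
root.assignments["E02"] = "leaf-2"
new: List[Context] = []
hung: List[Context] = []
split(root, "E03", ["leaf-3", "leaf-4"], new, hung)
```

Each child sees the shared assignments for E01 and E02 through its
parent, while the two competing assignments for E03 stay isolated in
their own layers.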

Incremental PLANCHART Expansion

Consider the situation in which an active partial
interpretation can find no acceptable assignment for its next event in
the PLANCHART. There are two actions possible: either (a) conclude that
the current partial interpretation is a dead end, and move it to the
HUNG list; or (b) conclude that the PLANCHART has not been expanded
sufficiently to account for the current data.

In case (b), the analyzer passes control to PATN, which expands
those subgoals most likely to be relevant to this interpretation.
Since the PLANCHART is kept in the GLOBAL context, other
interpretations may also benefit from the additional growth. This
is the only situation in which the PLANCHART is expanded. (This rule
is modified slightly in the next section.) Limited incremental
growth ensures that a minimum number of irrelevant synthetic solutions
are generated.

Unfortunately, deciding whether (a) or (b) is actually the case
may be difficult. The difficulty is compounded by the fact that a given
data event need not be an exact match to a PLANCHART leaf in order to be
assigned to it; it could be a buggy version, or an equivalent
construct. There are three technical problems: (1) choosing between
cases (a) and (b) above for a given leaf; (2) locating the relevant
existing leaves which ought to be considered in view of possible
equivalence and bugginess; and (3) locating the relevant existing
partial interpretations which might be able to "make use" of newly
generated PLANCHART leaves, especially in view of possible
equivalence and bugginess.

Now, if the analyzer is too miserly in allowing PLANCHART growth, an
event might be interpreted as a buggy version of an existing leaf when
only slight growth would have allowed it to match a new leaf exactly.
But if the analyzer is too eager to expand the PLANCHART, the number
of irrelevant synthetic solutions considered could be enormous.

We plan to provide the analyzer with a number of strategies
for dealing with these problems. One strategy, which has already been
introduced, handles the case where the relevant events are EQUAL; this is
the unique-izing of subexpressions. But unique-izing is inadequate to
deal with buggy or equivalent versions. Another strategy employs a hash
coding scheme, where the contents of the buckets are pointers into the
PLANCHART.

Markers and Marker Propagation

A third set of strategies for dealing with the difficulties
of the previous section relies on a system of PLANCHART markings and
marker propagations. The marker scheme is of interest because it is
also used to produce the final structural description, by selecting a
subtree of the PLANCHART. The assignment of a data event to a PLANCHART
leaf can be thought of as "marking" that leaf.

Now recall that the PLANCHART is essentially an elaborated
AND/OR goal tree. Each non-terminal node type represents an ATN
state, each of which specifies either a conjunction or a
disjunction of subgoals, with possible sequencing constraints.
Consequently, we can allow markers to propagate upward through the
PLANCHART according to three rules:

1. MPR-1. If the parent of a marked node is a
disjunctive type (e.g., CHOOSE), the parent is marked;

2. MPR-2. If the parent of a marked node is a
conjunctive type (e.g., SEQ), and the siblings of the
marked node are also marked, the parent is marked (note
that if there were constraints on the ordering, but the
events appeared in the wrong order, the siblings would
probably not have been marked);

3. MPR-3. If no higher plausibility interpretation
can be discovered, under certain conditions a propagation
may be postulated when neither rule MPR-1 nor rule MPR-2
is completely satisfied. (This third propagation rule is
designed to allow structurally ill-formed
["ungrammatical"] plans to be analyzed, but with lessened
plausibility.)
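
Rules MPR-1 and MPR-2 can be illustrated with a small Python sketch.
It deliberately omits MPR-3 and ordering constraints, and the chart
fragment, node names and the function mark are hypothetical; it is a
sketch of the propagation idea, not the analyzer's actual procedure.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical PLANCHART fragment: node -> (node type, children).
# CHOOSE is treated as disjunctive and SEQ as conjunctive.
CHART: Dict[str, Tuple[str, List[str]]] = {
    "SOLVE": ("SEQ", ["A", "B"]),
    "A": ("CHOOSE", ["a1", "a2"]),
    "B": ("SEQ", ["b1", "b2"]),
}
PARENT: Dict[str, str] = {
    child: node for node, (_, kids) in CHART.items() for child in kids
}


def mark(node: str, marked: Set[str]) -> List[str]:
    """Mark `node`, then propagate upward using MPR-1 and MPR-2 only.

    Returns the chain of ancestors that became marked as a result.
    """
    marked.add(node)
    chain: List[str] = []
    parent: Optional[str] = PARENT.get(node)
    while parent is not None:
        kind, kids = CHART[parent]
        if kind == "CHOOSE":
            satisfied = True                          # MPR-1: one marked child suffices
        else:
            satisfied = all(k in marked for k in kids)  # MPR-2: all siblings marked
        if not satisfied:
            break
        marked.add(parent)
        chain.append(parent)
        parent = PARENT.get(parent)
    return chain


marked: Set[str] = set()
chain_a1 = mark("a1", marked)   # A is marked; SOLVE still waits for B
chain_b1 = mark("b1", marked)   # B still needs b2, so nothing propagates
chain_b2 = mark("b2", marked)   # B completes, and then SOLVE completes
```

The length of the returned chain is the kind of quantity a plausibility
heuristic could compare when weighing alternative assignments.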

Top down MOD plans (see below), however, are handled specially.
The solution for the top level problem should be propagated when it is
finished, even though the solutions for the subproblems have not yet been
encountered; but the expectations for the subproblem solutions remain
in effect, and cause subsequent propagations when they occur. This
is indicated by using two different marker symbols.

The marker propagation status is local to a partial
interpretation and its offspring. Notice that it indicates which synthetic
subgoals are expected, and which are satisfied. An upward propagation
corresponds to what might be termed a reduction in a bottom up
parsing scheme. The propagation of markers is intended to allow the
analyzer to efficiently draw inferences about the probable solution
path represented by the protocol, with respect to a particular assignment
of events.



At intermediate stages in the analysis, these PLANCHART markers
provide evidence concerning the plausibility of alternative
interpretations. This is especially important when additional PLANCHART
growth is under consideration. The following guidelines follow
immediately:

PLR-1. An event assignment which would result in
a propagation is more plausible than one which would not.

PLR-2. An event assignment which would result in
a long chain of propagations is more plausible than one
which would result in a shorter chain.

PLR-3. A completed interpretation (one which has
interpreted the final protocol event) which propagates a
marker to the top level SOLVE node is much more plausible
than one which does not (a consequence of the
"reasonableness" assumption).

PLR-4. An event assignment to a conjunction
dominated leaf, many of whose siblings are marked, is more
plausible than an assignment to such a leaf only a few of
whose siblings are marked. A similar rule holds for
plausibly marking non-terminal nodes.

PLR-5. No leaf should be marked by more than one
event. More generally, a node dominated by a marked node
should not be marked. One exception is that if the
dominating marking was via marker propagation rule MPR-3
(or the USE nodes of a top down MOD plan), and if the new
marking would have allowed a propagation via MPR-1 or
MPR-2, then the node may be marked. The other exception
is that if the marking was the result of a buggy
assignment, and the new marking is the correct version of
that assignment, the node may be marked.

PLR-6. Assignments which result in propagations
by propagation rule MPR-3 are much less plausible than
assignments which result in propagations by rules MPR-1
or MPR-2.

These heuristic guidelines help the analyzer to: (a) determine
whether it is propitious to allow additional PLANCHART growth; (b)
select the preferred interpretation for an event; and (c) select the
preferred structural description of the protocol, which is a subtree of
the final PLANCHART.

The marker propagation scheme provides a precise notion of
expectations. A constituent is expected to the extent to which it
would result in propagations. For example, consider an Identification
Plan for solving a subproblem. If the subproblem had previously been
solved and saved in a file, it is expected that a command retrieving the
solution from the file will occur. The PLANCHART would contain an
unordered conjunction of subgoals, one to add a use of the solution
to the subproblem to the solution to the top level problem, and one
to retrieve the solution to the subproblem from the file. After an
event had been assigned to the former, the latter would be expected because
its occurrence would result in a propagation at least as far as the
Identification Plan node.

Suppose that an expectation (such as the Identification Plan
example) fails to be satisfied after many events. One possibility
is that the partial interpretation which expects it is just on the wrong
track, and should be abandoned. A second possibility is that the overall
subgoal structure is correct, but the subject has proceeded to
re-solve the problem via Decomposition or Reformulation, perhaps
because the existing solution had some undesirable property. If
this second possibility was in fact the case, then when the
subproblem's solution was completed, the resulting propagation would
"turn off" the aberrant expectation, since it would then be dominated by
a marked node.

A third possibility is that the student/subject is actually
using an ungrammatical plan. If a file retrieval is not performed as
expected, it could be that the student simply forgot to do it, or
thought that it was unnecessary, mistakenly believing that its
solution was already present in the workspace. The fact that a plan
is ungrammatical does not make it unanalyzable, however. When
the end of a solution to a subproblem is encountered, some
propagation ought to occur under every ACTIVE interpretation. If such an
event is followed by events which are analyzed as diagnosis, then the
most plausible propagation is forced, even if this is only possible via
rule MPR-3. The plausibility of this interpretation will be greatly
increased if the missing event eventually does occur as a result of
subsequent error correction.

The Event Classifier

The event classifier module contains the syntactic knowledge
necessary to distinguish the various domain-specific event types. The
event classifier is one of the few components of PAZATN which would need
to be redefined for each domain. In assigning an interpretation to an
event, a variety of semantic and pragmatic evidence may ultimately
be considered by the analyzer; but the domain-specific event
classifier is deliberately restricted to syntactic evidence (and timing
data, for a few cases such as those mentioned earlier).

The event classifier can be invoked in three modes. In the normal
mode (which is used by the preprocessor) its input is an event, and its
output is that event's primary syntactic class. For most events, this
is sufficient. The second mode of operation is used by partial
interpretations which find the primary syntactic class of the event to be
questionable, but have a specific alternative class under consideration.
In this second mode, the classifier is called with an event and a
proposed alternative category. The classifier returns with a numerical
summary of the syntactic evidence relevant to the proposed
reclassification. The third mode is employed when the primary class is
questioned, but no alternative readily suggests itself. The classifier
returns with an exhaustive rank-ordered list of the syntactic categories
and their (syntactic) plausibilities.

Event classification would be performed using
straightforward pattern matching. The details, being domain specific,
are generally uninteresting and are not given here.
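
As a rough illustration of the three modes, the following Python
sketch classifies events by regular-expression matching. The category
names, the patterns, the UNKNOWN fallback and the all-or-nothing
evidence score are assumptions made for the example; the report leaves
these domain-specific details unspecified.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical syntactic categories and patterns for a LOGO-like domain.
PATTERNS: Dict[str, str] = {
    "DEFINITION": r"^TO\s+\w+",
    "MOTION": r"^(FORWARD|BACK|LEFT|RIGHT)\s+\d+$",
    "EDIT": r"^EDIT\s+\w+",
}


def evidence(event: str, category: str) -> float:
    """Second mode: numerical evidence that `event` belongs to `category`.

    A real classifier would return graded scores; 0.0 or 1.0 is a
    simplification made for this sketch.
    """
    return 1.0 if re.match(PATTERNS[category], event) else 0.0


def ranked(event: str) -> List[Tuple[str, float]]:
    """Third mode: every category with its plausibility, best first."""
    scores = [(category, evidence(event, category)) for category in PATTERNS]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def primary_class(event: str) -> str:
    """Normal mode: the single best syntactic class for the event."""
    category, score = ranked(event)[0]
    return category if score > 0.0 else "UNKNOWN"
```

For instance, primary_class("FORWARD 100") would serve the
preprocessor, while evidence("FORWARD 100", "EDIT") answers a
reclassification query from a doubtful partial interpretation.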

The Event Interpreter and Event Specialists

The event interpreter is the module responsible for category
independent operations of event interpretation. This includes the
context saving and restoration sequence described in the DATACHART section,
the actual processing required for marker propagation, and the marker
status plausibility computations. The rationale for grouping these
activities into a separate component is modularity: they are routinely
required, and common to every category of event interpretation.

The event interpreter is the "inner loop" of the analyzer.
It is invoked by the scheduler with two arguments: a handle to a partial
interpretation, and a data event from the protocol. In cooperation
with one or more event specialists, it attempts to explain that data
event in the context of that partial interpretation. This may result
in the creation of one or more additional (descendant) partial
interpretations. When event interpretation is complete, control
returns to the scheduler.

A collection of domain specific event specialists [ESP's] are
responsible for category dependent operations of event
interpretation. Each specialist contains the requisite knowledge for
analyzing events of a particular syntactic type. The event interpreter
invokes an ESP with an event (and an implicit assumption regarding
its syntactic category) in the context of a given partial interpretation.
The specialist is free to assign any interpretation to the event which
is consistent with the categorization assumption. However, a given
specialist is not free to consider the possibility that the category
assumption is incorrect.

If the event specialist does not return with a sufficiently
plausible event assignment, the event interpreter will then consider
the possibility that the syntactic category which has been postulated
for the event may be incorrect. Whenever an event is interpreted
as buggy, expectations for diagnosis and repair are generated at the
request of the event interpreter. The details of the ESP's for
particular task domains are not given here; examples of ESP's for the
LOGO graphics domain are presented in [Miller & Goldstein 1976d].



The Scheduler 



The remaining module to be considered is the scheduler. The job of
the scheduler is to drive the analysis through a best first coroutine
search of the space of partial interpretations. Ultimately it arrives
at one or more plausible completed interpretations.

The state of each interpretation is represented by assertions
in its context layer. For example, one fact which the scheduler needs to
know about an interpretation is how far along it is in processing the
protocol. (Note that not all interpretations are equally far
along.) This progress is represented by an assertion of the form:

    (INPUTMARKER <event#>)

which means that the input marker is sitting immediately after the
<event#>'th input event.

Another set of facts which are needed are the event assignments.
These are assertions of the form:

    (ASSIGNMENT <event#> <leafptr>)

which means that the <event#>'th event has been assigned to the PLANCHART
leaf referenced by <leafptr>. Note that at most a few of these assignment
assertions are explicitly present in a given layer; the rest are
inherited from higher up in the context hierarchy.




The scheduler maintains three lists of partial
interpretations (handles into the context hierarchy): the NEW list,
the ACTIVE list, and the HUNG list. Every partial interpretation
which has been discovered is on one of these three lists. Typically
interpretations on the ACTIVE and NEW lists are further along in processing
the input. Those on the HUNG list will not make progress unless a
sufficient number of currently ACTIVE interpretations become HUNG, at
which time some HUNG interpretations may be reactivated.

The basic difficulty which is faced by the scheduler is to ensure
that interpretations which have a reasonable likelihood of succeeding
continue to make progress, while those that are likely to fail do
not consume valuable resources. ACTIVE interpretations are pursued
in parallel, while HUNG interpretations are available should
backup become necessary. The size of the ACTIVE set is a global parameter
of the analyzer. It should be chosen to be just large enough to ensure
that backup will be infrequent, but not so large that progress
is forestalled. A fundamental hypothesis is that the ATN plus the event
specialists provide sufficient information to constrain the likely
interpretations to a moderately small number.

The scheduler operates by cycling through the ACTIVE list,
allowing each partial interpretation to process one input event. Then
the plausibility of each modified interpretation is computed, and the
ACTIVE and HUNG lists are updated. NEW interpretations (resulting
from the splitting of ACTIVE interpretations on the previous cycle)
are automatically moved to the ACTIVE list, to ensure that they receive at
least one quantum of processing before being HUNG. The plausibility of a
partial interpretation increases with each additional event accounted
for. (This provides for automatic attenuation of older HUNG
interpretations.)

This coroutine search process continues until at least one ACTIVE
interpretation has processed the last input event with high plausibility.
To be highly plausible, a finished interpretation should not have
dangling expectations, but be a successful solution of the original
problem. If the first successful interpretation is not sufficiently
better than every other candidate, some of the better alternatives may
also be pursued until they become implausible or determine that in fact
the protocol may successfully be interpreted in more than one way.
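
The scheduling cycle can be sketched in Python as shown below. This is
a simplified illustration under stated assumptions, not the analyzer's
design: the names Interp, schedule and toy_step, the numeric threshold,
and the policy of reactivating the single best HUNG interpretation when
the ACTIVE list empties are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Interp:
    """A partial interpretation: events consumed so far, and a score."""
    name: str
    position: int = 0           # plays the role of the INPUTMARKER assertion
    plausibility: float = 0.0


def schedule(protocol: List[str], start: List[Interp],
             step: Callable[[Interp, str], List[Interp]],
             active_size: int = 2, threshold: float = 0.5) -> Optional[Interp]:
    """Best first search over partial interpretations (illustrative only).

    `step` interprets one event for one interpretation, updates its
    plausibility in place, and returns any NEW interpretations created
    by a split; each of those must already count the event as consumed.
    """
    if not protocol:
        return None
    new, active, hung = list(start), [], []
    while True:
        active.extend(new)                  # NEW always gets one quantum
        new = []
        if not active:
            if not hung:
                return None                 # nothing left to pursue
            hung.sort(key=lambda i: i.plausibility, reverse=True)
            active.append(hung.pop(0))      # back up to the best HUNG one
        for interp in list(active):
            new.extend(step(interp, protocol[interp.position]))
            interp.position += 1
        done = [i for i in active + new
                if i.position == len(protocol) and i.plausibility >= threshold]
        if done:
            return max(done, key=lambda i: i.plausibility)
        new = [i for i in new if i.position < len(protocol)]
        live = sorted((i for i in active if i.position < len(protocol)),
                      key=lambda i: i.plausibility, reverse=True)
        active = live[:active_size]         # keep only the best few ACTIVE
        hung.extend(live[active_size:])     # the rest become HUNG


def toy_step(interp: Interp, event: str) -> List[Interp]:
    """Toy stand-in for the event interpreter: fixed gains, no splits."""
    interp.plausibility += 0.3 if interp.name == "good" else 0.1
    return []


best = schedule(["E01", "E02", "E03"], [Interp("bad"), Interp("good")], toy_step)
```

In this toy run both interpretations stay ACTIVE, and only the one whose
plausibility has climbed past the threshold by the final event is
returned.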

SECTION IV

REFINING THE ANALYZER

Overview of Refinements

This section examines two broad classes of refinements to the PAZATN
protocol analyzer's basic design. The first class is a set of
elaborations to the slightly simplified description of the previous
section, which will be included in our first implementation.
The second category consists of some possible alternatives to the
organization presented here. Our purpose in outlining this second
category is to provide the reader with a flavor of the issues involved.

Our overall scheme for doing protocol analysis is to use PATN to
generate expectations, and then to define a recognition process that
attempts to match these expectations to a protocol. This parsing process
can be refined by utilizing several ideas that have proven effective
in problem solving and language parsing programs, including
lookahead (e.g., [Aho & Ullman 1972]), least commitment (e.g.,
[Sacerdoti 1975]), and differential diagnosis (e.g., [Rubin 1975]). Some
of these have parallels in the synthesis process. Here we examine their
role in analysis.

We also briefly examine some techniques for improving the
applicability of the analysis scheme to use in dynamic tutoring. One
strategy is to replace the expert ATN by a modified version, which more
closely models the idiosyncratic problem solving behavior of the
individual student. Another strategy is to introduce pruning
procedures to reduce the amount of storage required by the analyzer.
Still another is to provide heuristics for dynamically adjusting parameters
of the recognition process in accord with the pragmatics of tutoring.


Finally we explore a number of issues related to possible alternative
design choices. The possibility of organizing PAZATN as an analytic ATN
instead of as a coroutine searcher is discussed. This approach
might offer greater clarity and modularity, decoupling matters of
efficiency from formal theoretical concerns. Limitations of the
breadth of the synthetic theory are also considered. Finally, the
question of episode based analysis (performing the analysis in
larger chunks) is raised.

Lookahead and Least Commitment

Lookahead and least commitment are related search strategies
designed to avoid premature decisions based on inadequate evidence,
and the resultant need to back up. Lookahead consists of briefly
examining later events in the input string prior to interpreting the
current event. Least commitment consists of postponing a decision
regarding the proper interpretation of the current event until
further evidence is gathered from later events.
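
The contrast between the two strategies can be made concrete with a
small sketch. It is an illustration only, not part of the PAZATN design:
the plan library PLANS, the event codes, and the function names are all
hypothetical, and an interpretation is reduced to "which plan predicts
this prefix of events".

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical plan library: each plan is the event sequence it predicts.
PLANS: Dict[str, List[str]] = {
    "square": ["FD", "RT", "FD", "RT", "FD", "RT", "FD"],
    "triangle": ["FD", "RT", "FD", "RT", "FD"],
    "pole": ["FD", "BK"],
}


def consistent_plans(plans: Dict[str, List[str]], events: List[str]) -> Set[str]:
    """Return the plans whose predicted sequence begins with `events`."""
    return {name for name, steps in plans.items() if steps[: len(events)] == events}


def interpret_with_lookahead(
    plans: Dict[str, List[str]], events: List[str], index: int, window: int
) -> Set[str]:
    """Lookahead: interpret events[index] after peeking at `window` later events."""
    return consistent_plans(plans, events[: index + 1 + window])


def least_commitment(
    plans: Dict[str, List[str]], events: List[str]
) -> Optional[Tuple[int, str]]:
    """Least commitment: keep every candidate alive until the evidence
    leaves exactly one. Returns (index of deciding event, plan name),
    or None if no single plan is ever singled out."""
    for i in range(len(events)):
        candidates = consistent_plans(plans, events[: i + 1])
        if not candidates:
            return None
        if len(candidates) == 1:
            return i, next(iter(candidates))
    return None
```

With no lookahead the first "FD" is compatible with all three plans; a
window of one event, or simply waiting for the next event, settles the
matter without any back up.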

Recall that PATN as an AI expert system always engages in strict top
down problem solving. The top level plan is completely defined
before the solutions for subproblems are attempted. Human problem
solving is not this uniform. Alternatives to pure top down planning
need to be incorporated by allowing variations on the order in which goals
are pursued.

A goal may be expanded before a subgoal, representing top down
planning. Or, once the need for a particular subgoal has been
established, that subgoal may be expanded before ascertaining which
other subgoals are needed for the main goal, representing bottom up problem
solving. Figure 11 illustrates a top down expansion, while Figure 12
illustrates bottom up.

Figure 12. Bottom Up Expansion

A bottom up or mixed solution order is a good example of the
possibility for misleading mismatches between expectations and protocol
events. Least commitment helps to minimize this. The net effect is that
at those decision points where the choice is essentially arbitrary (such
as in the particular sequence for accomplishing a SET plan) PATN generates
a disjunctive set of possibilities, rather than making an arbitrary
selection. Thus, at any point in the parsing process, a set of
alternative expectations may be present. This avoids a blind depth first
top-down analysis, and reduces costly backup.

We have already seen some use of these techniques by PATN. The
primary application of least commitment, in the synthetic component,
is the avoidance of arbitrary ordering decisions. As currently
designed, PATN can optionally be instructed to produce procedural
nets [Sacerdoti 1975]. Figure 13 illustrates how purely sequential
solution procedures, unlike procedural nets, overspecify the ordering
constraints. The virtue of the procedural net representation for PAZATN
is that, when an ordering would be arbitrary, there is no reason to expect
the student to choose the same path as PATN. By postponing the
decision, a greater number of interpretations can be implicitly represented
by a single PLANCHART marking.
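
A minimal sketch of why a partially ordered net admits more student
solutions than a fixed sequence follows. The step names and the
representation of a net as a set of "earlier, later" pairs are
assumptions made for illustration; they are not taken from PATN or from
Sacerdoti's system.

```python
from itertools import permutations
from typing import Iterable, List, Sequence, Set, Tuple

Constraint = Tuple[str, str]  # (earlier step, later step)

# Hypothetical plan: two preparations in arbitrary order, then the main step.
STEPS: Set[str] = {"get-paint", "get-ladder", "paint-ceiling"}
NET: Set[Constraint] = {
    ("get-paint", "paint-ceiling"),
    ("get-ladder", "paint-ceiling"),
}
# A purely sequential procedure commits to one arbitrary ordering.
SEQUENTIAL: List[str] = ["get-paint", "get-ladder", "paint-ceiling"]


def matches_net(
    steps: Set[str], constraints: Iterable[Constraint], observed: Sequence[str]
) -> bool:
    """True if `observed` performs every step once and respects the ordering."""
    if len(observed) != len(steps) or set(observed) != set(steps):
        return False
    position = {step: i for i, step in enumerate(observed)}
    return all(position[a] < position[b] for a, b in constraints)


def linearizations(
    steps: Set[str], constraints: Iterable[Constraint]
) -> List[List[str]]:
    """Enumerate every total order the net implicitly represents.
    Exhaustive, so suitable only for very small nets."""
    constraints = list(constraints)
    return [
        list(p)
        for p in permutations(sorted(steps))
        if matches_net(steps, constraints, p)
    ]
```

A student who fetches the ladder first is accepted by the net but would
be a mismatch against SEQUENTIAL, even though nothing in the problem
favors one ordering over the other.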


Examples of the techniques occur in the analytic component as
well. Some difficulties which are encountered in designing event
specialists, for example, can be resolved by the use of demon procedures
[Charniak 1972]. In certain situations a demon would be created to
represent an event assignment which depends on subsequent events.
When the relevant events are finally encountered, the demon would then
fire, completing the assignment on the basis of the additional
information.
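
The demon mechanism can be sketched as follows. This is a toy under
stated assumptions: the class names, the event strings, and the rule
that a FORWARD is a "pole" only if immediately followed by a BACK are
all invented for the example and are not PAZATN's event specialists.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Demon:
    """A suspended decision about an earlier event."""
    pending_index: int                 # which event is still unassigned
    trigger: Callable[[str], bool]     # does this later event wake the demon?
    decide: Callable[[str], str]       # category chosen once it fires


@dataclass
class EventInterpreter:
    assignments: Dict[int, Optional[str]] = field(default_factory=dict)
    demons: List[Demon] = field(default_factory=list)

    def observe(self, index: int, event: str) -> None:
        # First let waiting demons inspect the new event.
        still_waiting = []
        for demon in self.demons:
            if demon.trigger(event):
                self.assignments[demon.pending_index] = demon.decide(event)
            else:
                still_waiting.append(demon)
        self.demons = still_waiting

        # Then handle the new event itself.
        if event.startswith("FORWARD"):
            # Ambiguous on its own: postpone, and leave a demon behind.
            self.assignments[index] = None
            self.demons.append(
                Demon(
                    pending_index=index,
                    trigger=lambda later: True,  # fires on the very next event
                    decide=lambda later: "pole" if later.startswith("BACK") else "side",
                )
            )
        else:
            self.assignments[index] = "other"
```

The assignment for the FORWARD event stays empty until the following
event arrives, at which point the demon fires and then disappears.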

One effective application of least commitment in the analytic
component is the sharing of substructures in the PLANCHART. This
allows ambiguous collections of event assignments -- those which
have more than a single structural description -- to be economically
stored. Rather than committing the analysis to one or another
structure, the decision is postponed until some event provides evidence
clearly favoring one or the other. Implementing this policy does not
require special action. It is an automatic consequence of the analyzer's
data structures.

Figure 13. Procedural Nets versus Sequential Procedures
A Procedural Net For Building A Tower
After Criticism to Resolve Conflicts
[Based on Sacerdoti 1975, p. JS]

PAZATN can also benefit from a type of lookahead which has not been
presented so far. Previously it was claimed that PLANCHART growth was
to be limited to those cases in which a plausible active
interpretation could not find an acceptable assignment for its next
event. This statement was an expository simplification and is not
strictly true.

The primary objective of PAZATN's control structure is to cause the
strongest sources of constraint to be utilized first. This is to prevent
unguided search in a potentially large space. Thus, when there is clearcut
bottom up evidence of a particular constituent, that evidence should be
examined. Likewise, when a top down decision is straightforward, that
route should be pursued prior to making less certain analytic
assumptions.

Therefore, instead of severely restricting PATN's activity,
as previously stated, we actually intend to allow it some freedom to
exploit strong sources of top down constraint. Some synthetic decisions
are virtually forced by the form of the model. There is no reason to
interrupt PATN when it is about to make such a decision. This can be
viewed as a type of lookahead, in that even before the event
interpreter has noticed any deficit, the synthetic component has
predicted the necessity for -- and accomplished -- appropriate PLANCHART
growth.

PAZATN's analysis process is actually designed to begin by
synthetic examination of the model. This top down investigation
proceeds until some decision point is reached for which the synthetic
basis is uncertain in some fundamental way. At that point, control
switches to the analytic component. Likewise, whenever the ATN is
invoked, it is allowed to proceed so long as its choices follow from firm
criteria. This reduces the overhead of constantly switching between
event interpretation and plan synthesis. Operations would proceed
with fewer interruptions, in slightly larger units.

Despite its virtues, though, least commitment could be overdone.
The result would be such a large disjunction of expectations that no
guidance could be obtained. Moreover, the relationship between the
system's formal model and the student's intuitive model is tenuous.
The analyzer strikes a balance between overly committing itself, and
stubbornly refusing to take decisive action. This is accomplished by
avoiding overcommitment in the course of a given decomposition strategy,
but requiring bottom up evidence to change the formulation of the model.
The next section describes the differential diagnosis knowledge that would
be used to request such reformulations.



Differential Diagnosis

We have already encountered a use of demon procedures by the
analyzer; this was to handle the problem of the assignment of a given
event depending primarily on the assignment of some future event.
Another use of demons, which we did not consider, is to perform
differential diagnosis in deciding between two interpretations, or in
recovery of an appropriate explanation when a given approach becomes
hung. In those situations where even the use of least commitment fails
to produce a successful set of expectations, differential diagnosis
knowledge should direct PAZATN to produce a new set of expectations.
There are two situations where differential diagnosis is appropriate.
One is the use of explicit diagnostics for unsuccessful category
assignments. The second, and most significant, is the
reformulation of the problem description to achieve consistency with
bottom up evidence.




In our first order description of the event specialists, we imposed
the stringent requirement that no specialist ever consider the
applicability of another specialist; this job was left to the event
interpreter. Sometimes this requirement can be artificial. When a piece
of category specific knowledge is able to diagnose the appropriateness of
some other ESP, then that piece of knowledge belongs within the
specialist for that category.



Likewise, differential diagnosis is used to select the proper
subset of a disjunctive set of expectations (such as is produced using
the least commitment policy). Conversely, when none of the alternative
expectations matches the protocol, the analyzer requests that PATN
perform a reformulation consistent with that evidence. The following
are some examples of demon templates, which can be instantiated to
realize this behavior in specific situations.

     DDR-1. If the current protocol segment uses a
     named subproblem whose model has been firmly established,
     and if that model corresponds to a disjunctive subset of
     the current expectations, then select that subset. If no
     expectation corresponds to the model of this segment,
     reformulate the current problem description in such a way
     that this model is among the expected subgoals.

     DDR-2. If the effects produced by the current
     protocol segment match a disjunctive subset of the
     current expectations, select that subset. If not,
     consider a reformulation that uses a model satisfied by
     the segment effects as a subgoal. (The possibility that
     the current segment is an error must also be
     considered.)

     DDR-3. If the subject states that the current
     segment corresponds to a certain subgoal, select that
     subgoal. If that subgoal is not among the current
     expectations, reformulate the model so that it is.

     DDR-4. If the current segment accomplishes the
     effects of an expected subgoal, but not by a plan that
     matches current expectations (e.g. via different control
     structure) then reformulate for this part, in terms of a
     model corresponding to the control structure observed in
     the protocol. Generic/explicit conversion [Miller &
     Goldstein 1976b] could be handled by this rule, for
     instance.

     DDR-5. If the effects of the current segment
     violate only a few model predicates under the current
     interpretation, but the segment has a sub-segment
     structure that does not correspond to expectations, then
     reformulate. If there are too few segments, try
     regrouping into compound parts. If there are too many
     segments, try dissecting model parts which contain
     multiple sub-parts.

This list is not exhaustive. However, it does suggest how
differential diagnosis demons could be useful in refining the basic
analyzer.
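
As one possible reading of how such a template might be instantiated,
the sketch below renders DDR-2 as a closure over the current
expectations. Everything here is an assumption for illustration: the
function name make_ddr2, the representation of effects as sets of
strings, and the reading of "match" as "every effect the subgoal
requires is among the segment's effects".

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple, Union

Effects = FrozenSet[str]
Verdict = Tuple[str, Union[Set[str], Effects]]


def make_ddr2(expectations: Dict[str, Effects]) -> Callable[[Effects], Verdict]:
    """Instantiate the DDR-2 template for one disjunctive set of expectations.

    `expectations` maps each expected subgoal to the effects it should
    produce (a hypothetical encoding).
    """

    def demon(segment_effects: Effects) -> Verdict:
        # Subgoals all of whose required effects the segment produced.
        matching = {
            goal
            for goal, required in expectations.items()
            if required <= segment_effects
        }
        if matching:
            return ("select", matching)
        # No expectation fits: ask the synthetic component for a
        # reformulation built around the observed effects. The caller must
        # still weigh the possibility that the segment is simply an error.
        return ("reformulate", segment_effects)

    return demon
```

A usage example, with invented subgoal and effect names:

```python
demon = make_ddr2({
    "draw-roof": frozenset({"triangle-drawn"}),
    "draw-pole": frozenset({"vertical-line-drawn"}),
})
demon(frozenset({"triangle-drawn", "pen-up"}))   # selects {"draw-roof"}
demon(frozenset({"circle-drawn"}))               # requests a reformulation
```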

Tailoring the ATN to the Individual

In previous sections, it has been assumed that PATN is a spanning
model, in other words, that the ATN is capable of exhaustively
enumerating the space of reasonable problem solving behaviors (within its
chosen domain). To this definition is added the caveat that
"irrational bugs" such as typing errors are often understandable as buggy
versions of one of these intended synthetic solutions.

It might seem that the caveat leaves the definition so weak as to
be vacuous. But it is at least thinkable, if not probable, that some
human problem solvers might display genuinely irrational intent. This
does not refer to deliberately trying to mislead the analyzer -- "hacking
the system". In PATN terminology, such problem solvers would have a
deviant ATN. Their protocols would be more difficult, if not impossible,
to analyze.

In what ways can an ATN be incorrect? One error would be to have a
variant of the optimal pragmatic arc constraints. A characteristic
example would be an ATN with an overly developed critic on the linear
planning arc. A problem solver, having encountered several cases in
which an initially linear attack led to bugs, might reach the general
conclusion that all problems require a non-linear approach.
Consequently, any problems which appeared to be linear might be
reformulated to ensure the introduction of non-linearities.

Such an approach, of course, misses the valuable guidance in
understanding the complexities of novel tasks, which is offered by the
failure of the linear plan. This quirk is common among novices in
the programming domain, for example. Relations, which by all accounts
of "style" in programming ought to be accomplished via an interface step,
will be accomplished as part of the definition of an adjacent main step.
For example, a WISHINGWELL is defined as a TOP, a POLE, and a WELL,
where the setups for each are included in the subprocedures.

More serious would be to have missing or extra arcs. A novice
programmer, whose prior experience was in the BASIC language,
would probably be missing the recursion arc for achieving round
plans. Consequently all problems involving generic models would be
solved by iteration. Those problems for which iteration is truly
inadequate, such as drawing arbitrarily deep binary trees, would be
unsolvable.

Even more catastrophic would be to have missing or extra states.
Suppose one wished to apply PAZATN to the analysis of protocols produced
by some other Artificial Intelligence program. It is likely that
reformulation would not be one of its solution techniques; the relevant
states would probably be missing entirely.

Moreover, the class of "rational" bugs should really be seen as
relative to the problem solver's computational resources. Suppose there
were certain systematic limitations on the ATN, such as an upper bound
on the size of the structures contained in (or pointed to by) its
registers. Some bugs which formerly might have been termed "irrational",
in that they might have been avoided by consulting the critics gallery
for example, become "rational". This is because a plan involving
oversimplification, followed by debugging, may place less stringent
demands on the limited resource. Rationality, by definition, is
measured with respect to some estimate of utilities, costs, and risks.

Very likely, it is possible to handle most protocols produced by
such non-ideal problem solvers without significantly modifying PAZATN's
design. It is easy to generate example solutions which PATN would be
loath to produce, but which PAZATN, using the PATN ATN, can nonetheless
understand. Whether compelling counterexamples can be found is an open
question.



Nevertheless, a drastic reduction in search would result if the
problem solver's quirks were turned to advantage. In tutoring the same
student day after day, for example, consistent failure to use a certain
type of plan should suggest to PAZATN that it is pointless to continue to
look for it (except perhaps as a last resort). Consequently, our
intention is to replace the expert ATN by an idiosyncratic version
tailored to the individual. Once such an idiosyncratic ATN has been
constructed, it can also be used, in tutoring applications, as a student
model for the selection of tutorable issues.
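
One simple way to picture this tailoring is as a reordering of the arcs
the analyzer tries, driven by what the student has actually been seen to
do. The sketch below is an assumption-laden illustration, not the
report's mechanism: the class name, the session count threshold, and the
plan type labels are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


class ArcPreferences:
    """Track which plan types one student uses, and order search accordingly."""

    def __init__(self, expert_order: Iterable[str], min_sessions: int = 5):
        self.expert_order: List[str] = list(expert_order)
        self.min_sessions = min_sessions   # evidence required before tailoring
        self.sessions = 0
        self.used: Counter = Counter()     # sessions in which each type appeared

    def record_session(self, plan_types_seen: Iterable[str]) -> None:
        self.sessions += 1
        self.used.update(set(plan_types_seen))

    def search_order(self) -> List[str]:
        """Plan types in the order the analyzer should look for them."""
        if self.sessions < self.min_sessions:
            return list(self.expert_order)          # not enough evidence yet
        seen = [p for p in self.expert_order if self.used[p] > 0]
        never = [p for p in self.expert_order if self.used[p] == 0]
        seen.sort(key=lambda p: -self.used[p])      # stable: ties keep expert order
        return seen + never                         # unused types: last resort only
```

Plan types the student has never used are not deleted, only demoted, so
that they remain available as a last resort.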

Further Improvements in Applicability to Dynamic Tutoring

Although an automatic protocol analyzer is a valuable tool in its
own right, the authors are particularly concerned that PAZATN's
structure be amenable to applications involving real time, on-line
tutoring. This constraint imposes strong limitations on the design,
most notably the restriction that events be processed in a single
pass in approximately left to right order. Moreover, the system
must be sufficiently responsive so as not to interfere with the
student's progress. Naturally this consideration is less critical in
the ex post facto exhaustive study of the protocol for theoretical
and experimental purposes.

To these ends, this section considers additional improvements
to PAZATN. The tailoring of the ATN to the individual, discussed
in the last section, is one improvement. Two further improvements are
presented. One is the introduction of pruning heuristics to reduce the
amount of storage required by the analyzer. The other is the
dynamic adjustment of key parameters of the recognition process, to
increase the system's responsiveness without degrading the accuracy
of its interpretations.

In order to assure reliability and the capability to recover from
initially erroneous interpretations, PAZATN keeps a record of every
partial interpretation which has been discovered. These are kept on
three lists: NEW, ACTIVE, and HUNG. Furthermore, every local
ambiguity can potentially cause PAZATN to save the state of the
interpretation, in the event that splitting this interpretation becomes
necessary. This cautious style might result in a very long HUNG list.

One technique for dealing with this contingency is to provide
heuristics which reduce the amount of unnecessary splitting. The
avoidance of overly cautious saving of states and splitting of
interpretations is not a complete solution, however. Unless reliability
is dangerously sacrificed, there are inevitably going to be a
substantial number of local ambiguities for which these precautions are
required. Only after examining later evidence will the doubtful status
of other alternatives be firmly established. Furthermore, it is not
enough that such low plausibility interpretations cease to consume
processing time. Their continued existence implies that the analyzer
will be "hanging on" to large quantities of storage in the form of
assertions in CONNIVER context layers (or their equivalent).

For this reason, PAZATN should include a mechanism for pruning
very implausible interpretations. The maximum allowable size of the HUNG
list, HMAX, is a parameter of the system. When HMAX is exceeded, the
lowest plausibility interpretation is deleted. This is based on a
heuristic assumption that, at most, HMAX interpretations will have
sufficient plausibility to warrant further consideration.


Unfortunately it is entirely possible that a prunable context
layer has non-prunable offspring. This is possible because the
prunable context layer implicitly represents the set of (typically
implausible) alternative interpretations other than those explicitly
represented by its (typically more plausible) offspring. Since these
offspring are inheriting assertions from the prunable interpretation, the
garbage collector will not be able to reclaim its space, except in
the case that all the offspring have also been pruned.

Fortunately, most context layers would probably have exactly one
subcontext. This is because the typical event would be sufficiently
ambiguous to warrant maintaining a potential for splitting, but not so
ambiguous as to cause any other alternative implicit in the parent context
to actually be pursued. The pruning procedure is designed to detect
this situation. When a context layer with exactly one non-pruned
subcontext is selected for pruning, this indicates that the subcontext may
be finalized. Consequently, the parent context layer may be spliced out
of the hierarchy altogether, and its space reclaimed. This helps to
impose an upper bound on the storage required by PAZATN.
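
The pruning policy of the last three paragraphs can be sketched in a few
lines. This is a hypothetical rendering, not CONNIVER and not the
planned implementation: ContextLayer, prune, and enforce_hmax are
invented names, and "reclaiming space" is modelled simply as unlinking a
layer from the tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # compare layers by identity, not by contents
class ContextLayer:
    name: str
    plausibility: float
    assertions: Dict[str, str] = field(default_factory=dict)
    parent: Optional["ContextLayer"] = None
    children: List["ContextLayer"] = field(default_factory=list)
    pruned: bool = False

    def spawn(self, name: str, plausibility: float, **assertions: str) -> "ContextLayer":
        child = ContextLayer(name, plausibility, dict(assertions), parent=self)
        self.children.append(child)
        return child


def prune(layer: ContextLayer) -> bool:
    """Mark `layer` pruned; return True if its space could be reclaimed."""
    layer.pruned = True
    if not layer.children:
        # No offspring inherit from it: simply unlink it.
        if layer.parent is not None:
            layer.parent.children.remove(layer)
        return True
    if len(layer.children) == 1 and not layer.children[0].pruned:
        # Exactly one live subcontext: finalize it and splice the parent out.
        child = layer.children[0]
        child.assertions = {**layer.assertions, **child.assertions}
        child.parent = layer.parent
        if layer.parent is not None:
            siblings = layer.parent.children
            siblings[siblings.index(layer)] = child
        return True
    # Several offspring still inherit its assertions: it must stay.
    return False


def enforce_hmax(hung: List[ContextLayer], hmax: int) -> List[str]:
    """Delete lowest plausibility interpretations until at most `hmax` remain.
    Returns the names of the layers whose space was reclaimed."""
    reclaimed = []
    while len(hung) > hmax:
        victim = min(hung, key=lambda layer: layer.plausibility)
        hung.remove(victim)
        if prune(victim):
            reclaimed.append(victim.name)
    return reclaimed
```

The third branch of prune corresponds to the unfortunate case described
above: the layer is marked as pruned but continues to occupy storage
until its offspring are themselves disposed of.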

We now turn our attention to another potential inefficiency
bug in the current design of PAZATN. This is that the size of the ACTIVE
list required to prevent frequent back up may be large. If so, the
system could simply be too slow for practical use in tutoring. PAZATN
requires some technique for increasing the responsiveness of the
system, while maintaining the effective size of the ACTIVE list.

The solution is to dynamically vary those parameters which
determine the size of this list. (The actual size would be determined by
a number of factors, including minimum size, maximum size, and
minimum plausibility for inclusion.) The capability for variation
would allow PAZATN to carry along a small working set of interpretations
when the student is rapidly typing. Whenever the student paused to think
or rest, the higher plausibility HUNG interpretations could be updated. In
this way, should one of these be reactivated later, less back up would be
required.
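
A sketch of how the three parameters might interact is given below. The
numeric settings, the pause threshold, and the names Limits, limits_for,
and split_working_set are assumptions chosen for the example; the report
does not specify values.

```python
from typing import List, NamedTuple, Tuple

Interpretation = Tuple[str, float]  # (label, plausibility)


class Limits(NamedTuple):
    min_size: int
    max_size: int
    min_plausibility: float


# Hypothetical settings: a tight working set while the student is typing,
# a more generous one while the student pauses.
BUSY = Limits(min_size=1, max_size=2, min_plausibility=0.5)
IDLE = Limits(min_size=2, max_size=4, min_plausibility=0.3)


def limits_for(seconds_since_last_event: float, pause_threshold: float = 5.0) -> Limits:
    """Choose parameters from how long the student has been idle."""
    return IDLE if seconds_since_last_event >= pause_threshold else BUSY


def split_working_set(
    interpretations: List[Interpretation], limits: Limits
) -> Tuple[List[Interpretation], List[Interpretation]]:
    """Return (ACTIVE, HUNG) under the given limits."""
    ranked = sorted(interpretations, key=lambda item: item[1], reverse=True)
    active = [item for item in ranked if item[1] >= limits.min_plausibility]
    active = active[: limits.max_size]
    if len(active) < limits.min_size:
        # Never let the working set fall below its minimum size.
        active = ranked[: limits.min_size]
    hung = ranked[len(active):]
    return active, hung
```

Calling split_working_set again with the IDLE limits during a pause
promotes the best HUNG interpretations, which is the point at which they
would be brought up to date.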



An elaboration of this refinement takes advantage of the primary
underlying reason for avoiding back up. The greatest danger of backup
in the tutoring application is that some previous suggestion or
criticism may turn out to have been inappropriate. This danger
can be reduced as follows. Naturally, the system should always
require a high degree of confidence in its interpretation prior to
intervening. This should be supplemented by filtering any remarks so
as to be appropriate under all reasonably plausible
alternative interpretations. (Introspection suggests that human
tutors employ a similar heuristic.)

Furthermore, immediately prior to the remark, the size of the
working set should be increased, and the reactivated interpretations
brought up to date. It should then be verified that those marginal
interpretations are unlikely to invalidate the planned remarks. This
implies that normally the system would be highly responsive; but if delays
were to be experienced, they would occur only when the student was about to
be interrupted for tutoring anyway.
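
The remark filter amounts to a universally quantified test over the
plausible interpretations. The following sketch assumes, hypothetically,
that each interpretation carries the set of remarks that would be
appropriate if it were correct; the names and the threshold are
illustrative only.

```python
from typing import List, NamedTuple, Set


class Interpretation(NamedTuple):
    name: str
    plausibility: float
    appropriate_remarks: Set[str]  # remarks that hold if this reading is right


def safe_remarks(
    candidates: List[str],
    interpretations: List[Interpretation],
    min_plausibility: float,
) -> List[str]:
    """Keep only remarks appropriate under every reasonably plausible reading."""
    plausible = [i for i in interpretations if i.plausibility >= min_plausibility]
    if not plausible:
        return []  # nothing is believed strongly enough to justify intervening
    return [
        remark
        for remark in candidates
        if all(remark in i.appropriate_remarks for i in plausible)
    ]
```

Lowering min_plausibility just before intervening corresponds to
enlarging the working set: more marginal interpretations get a chance to
veto the planned remark.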

Design Issues and Alternatives

The careful reader may have noticed that PAZATN is somewhat
independent of the detailed form of the synthetic formalism.
Although tremendous leverage for analysis is obtained by the postulation of
an effective synthetic theory, little use is made of the fact that
PATN is specifically organized as an Augmented Transition Network. For
example, the possibility that the debugging component is organized
differently has not been completely ruled out by anything which has been
said so far.



It does make a difference that the synthetic component plans and
debugs by making a series of pragmatic choices, which can be summarized by
the tree structured PLANCHART. Furthermore, it is essential that the
system is capable of generating, not one solution, but an entire space
of progressively less favored solution paths. Also, an implicit
assumption runs throughout the analyzer's design that the linguistic
analogy is fruitful -- that the solution path consists of structural,
semantic, and pragmatic elements. It may be that any synthetic
formalism satisfying these constraints is trivially equivalent to an
ATN. Such questions are notoriously difficult to answer.

It is probably a virtue that PAZATN is somewhat decoupled from this
issue, but one could construe it as a defect. One could argue that
somehow the design of the analyzer may be failing to take full
advantage of the claims of the theory. A possible alternative design
would be to organize PAZATN as an analytic version of the ATN. This
"AATN" would have numerically valued arc conditions, representing the
plausibility computations of the analytic pragmatics. Note that the event
specialists are to be organized internally as decision trees. It is only a
small step to reformulate this decision tree structure as a subgraph of an
AATN.

It might seem that employing an AATN instead of a coroutine
searcher might commit the analyzer to a less powerful automatic backtrack
type of control structure. This is not necessarily the case.
Depending upon the implementation, the ATN formalism per se carries no
irrevocable control structure assumptions. One may traverse the
diagram according to any of a wide variety of search strategies. In this
respect, the AATN would be attractive, offering greater perspicuity by
decoupling efficiency issues from theoretical concerns.

Nevertheless, the AATN design for PAZATN has not been pursued.
Although it is possible, in principle, to employ a mixture of top
down and bottom up strategies with an ATN, it is more natural to
conceptualize an ATN parser as a top down backtracker. To understand
their bottom up use, PUSH arcs must be thought of as "IF-REDUCE" arcs;
POP arcs must be thought of as "REDUCE" arcs. This felt counterintuitive.

An important issue in the design concerns the breadth of the
synthetic theory. There are of course particular lacunae, such as
conditional plans, which have been deliberately, but only temporarily,
ignored. The greater threat comes from the unknown. Even the youngest
children display an incredible richness in their problem solving
behavior. PATN's origins are at least partly empirical. But some
phenomena, perhaps those most in need of investigation, may have been
lost in the process of formalization. This remains a topic
for investigation.

A final design issue warrants mention here. PAZATN operates by
individually processing each event. But perhaps this leads to too local a
perspective. Perhaps larger sized chunks of protocol should be examined at
once. In other words, an episode based analyzer might be preferable.
The event based design has been selected because it is the simplest,
most straightforward approach.

SECTION V

TENTATIVE CONCLUSIONS AND PLANS FOR FUTURE WORK

Recapitulation

In this report we have investigated the problem of analyzing
problem solving protocols. The result of this investigation is a
preliminary design for PAZATN, a domain independent framework for
automatic protocol analysis. The foundation for the approach was a
grammatical theory of problem solving as a structured process of planning
and debugging. This led us to the definition of an interpretation
as an assignment of a structural description to a list of events,
augmented by semantic and pragmatic annotation associated with each node.


A key ingredient in the design is a synthetic problem solving
system called PATN. PATN employs an augmented transition network to
represent fundamental planning concepts, including techniques of
identification, decomposition, and reformulation. PAZATN is somewhat
decoupled from the ATN representation per se. However, considerable
leverage for the analysis process is obtained from PATN's ability
to generate successively less preferable solution paths, by a series
of pragmatically guided planning decisions, as well as from PATN's
characterization of certain bugs as errors in these planning choices.

The analysis procedure has been designed to obtain maximal
advantage from both top down synthetic guidance and bottom up analytic
constraints. Analysis proceeds by a coroutine search of a space of
plausible partial interpretations. The PLANCHART, a data structure
resembling an AND/OR goal tree, is used to keep track of synthetic
expectations. By careful selection of the representational scheme, this
structure achieves considerable storage economy. It is incrementally
expanded by the synthesis ATN when existing expectations are inadequate
in view of the protocol data. The DATACHART, a data structure
analogous to a context layered CONNIVER data base, is used to keep
track of the state of alternative partial interpretations.
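
To make the AND/OR reading of the PLANCHART concrete, here is a minimal
sketch of a goal tree that reports which leaf goals are still expected.
It deliberately omits markings, sharing of substructures, and
incremental growth; the Node class, the goal names, and the helper
functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    kind: str = "LEAF"                      # "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)


def satisfied(node: Node, done: Set[str]) -> bool:
    """Has this goal been accomplished, given the leaf goals already seen?"""
    if node.kind == "LEAF":
        return node.name in done
    results = [satisfied(child, done) for child in node.children]
    return all(results) if node.kind == "AND" else any(results)


def expectations(node: Node, done: Set[str]) -> Set[str]:
    """Leaf goals that could still usefully appear in the protocol.
    AND children are treated as unordered, as in a SET plan."""
    if satisfied(node, done):
        return set()
    if node.kind == "LEAF":
        return {node.name}
    expected: Set[str] = set()
    for child in node.children:
        expected |= expectations(child, done)
    return expected


# Hypothetical chart: the roof may be achieved in either of two ways.
CHART = Node("wishingwell", "AND", [
    Node("pole"),
    Node("roof", "OR", [Node("roof-as-triangle"), Node("roof-as-three-lines")]),
    Node("well"),
])
```

Once one alternative under the OR node has been observed, both roof
leaves drop out of the expectations, while the remaining AND children
stay expected in any order.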

The analogy to computational linguistics has turned out to be
fruitful, providing insights into the parsing process developed in
research on language understanding and speech recognition. The
value of this analogy is illustrated by the adoption of several
search strategies and representational techniques. For example, the
chart representation is utilized to economically store well-formed
substructures. Partial knowledge of structure and of the status of
synthetic expectations is recorded using a scheme of PLANCHART
markings and marker propagations. These would allow for considerable
efficiency both in storage and in the drawing of inferences regarding
possibly ambiguous structural descriptions. Likewise, the basic outlines
of PAZATN have been refined by the incorporation of search
heuristics prevalent in computational linguistics, including lookahead,
least commitment, and differential diagnosis. These would allow the
analyzer to proceed with reasonable assumptions when necessary, and yet
modify its interpretation in response to anomalies. Ideas for
replacing the expert ATN by a version tailored to the individual were
discussed. Major design issues and alternatives were also examined.

Although PAZATN is not yet a working program, the design is
sufficiently specific so as to be hand simulable. The next phase of the
research is to implement and experiment with a prototype analyzer.

Generality of PAZATN

^The design of PAZATN is of interest in that it suggests A papadrgm 
for protocol analysis which may be applica^ble to' many domains; Although 
an ^operational PAZATN system for a particular task domain requires 
considerable domain sp'ec if ic knowledge — a necessity if significant 'power 
is to be attained ~ its knowledge "is extremely .modular^ This' domain 
.specific Knowledge* is restricted^ to the event classifier, the event 
specialists, \he lowest levels of PATN, and the answer library. The other 
modules of PA-ZATN, which h^ve been emphasized in this report, make no 
domain specific assumptions* in their operation, this suggests tHat PAZATN 
sys^efts could ^ be constructed for a variety of domains by supplying \ 
y^jlug-in" modules for these* domain specific components; 
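The "plug-in" idea can be pictured with a small sketch. This is not
PAZATN code (the report states that no working program exists yet); the
names DomainPlugins, classify_event, specialists, answer_library and
analyze are hypothetical, and the toy algebra events are invented for
illustration. The point is only that the driver routes events through
whatever domain components it is handed and contains no domain
knowledge of its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DomainPlugins:
    """Hypothetical bundle of the domain specific components named in the text."""
    classify_event: Callable[[str], str]          # event classifier: raw event -> event type
    specialists: Dict[str, Callable[[str], str]]  # event specialists, keyed by event type
    answer_library: Dict[str, str]                # known answers (unused by this toy driver)


def analyze(protocol: List[str], plugins: DomainPlugins) -> List[str]:
    """Domain independent driver: it only routes events to the plug-ins."""
    interpretations = []
    for event in protocol:
        kind = plugins.classify_event(event)
        specialist = plugins.specialists.get(kind, lambda e: "unrecognized: " + e)
        interpretations.append(specialist(event))
    return interpretations


# Invented plug-ins for a toy algebra domain (assumption, not from the report).
algebra = DomainPlugins(
    classify_event=lambda e: "factor" if e.startswith("factor") else "expand",
    specialists={
        "factor": lambda e: "student factored " + e.split(" ", 1)[1],
        "expand": lambda e: "student expanded " + e.split(" ", 1)[1],
    },
    answer_library={},
)

result = analyze(["expand (x+1)(x+2)", "factor x^2+3x+2"], algebra)
```

Swapping in a different DomainPlugins value would, under these
assumptions, retarget the same driver to another domain.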

In our early work, a text by Donaghey & Ruddel [1975] was found to
be useful in organizing knowledge of elementary algebra into procedural
rules. It was found that many students demonstrated an understanding of
the rules, and often were able to apply them correctly. Their hardest
problem was to recognize the appropriateness of a given rule to
a particular problem situation. For example, in actual student
protocols, it was observed that students would multiply out an
expression, and then, only a few lines later, factor it again. This
haphazard application of inverse operations inevitably leads to careless
errors, by increasing the length and subjective difficulty of the task.

These algebraic rules can be modeled by a PATN-based synthetic
problem solver. Each algebraic transformation operation can be
associated with an arc transition on an ATN subgraph. Associated
with each transition is a set of semantic and pragmatic constraints on its
applicability. For example, to follow the factoring arc, the
semantics require the ?EXPRESSION register to be a polynomial in a
single variable with numerical coefficients. The pragmatics indicate that
this is an appropriate transition when the goal is to determine the roots
of the polynomial (see Figure 14). While many students will have
learned the syntax of the transitions, which is usually all that is
taught, their weaknesses often lie in not knowing the appropriate
semantic and pragmatic constraints.
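A minimal sketch of such an arc follows, assuming a deliberately simple
representation. Only the ?EXPRESSION register and the two kinds of
constraint come from the text; the Arc class, the ?GOAL register, the
"find-roots" goal label, and the dictionary encoding of a polynomial are
hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Registers = Dict[str, object]


@dataclass
class Arc:
    """One ATN transition guarded by semantic and pragmatic constraints."""
    name: str
    semantics: Callable[[Registers], bool]   # can the operation be applied at all?
    pragmatics: Callable[[Registers], bool]  # is it appropriate for the current goal?

    def applicable(self, regs: Registers) -> bool:
        return self.semantics(regs) and self.pragmatics(regs)


def is_univariate_numeric_polynomial(expr) -> bool:
    # Assumed encoding: {"variables": set of names, "coefficients": list of numbers}
    return (
        isinstance(expr, dict)
        and len(expr.get("variables", ())) == 1
        and all(isinstance(c, (int, float)) for c in expr.get("coefficients", []))
    )


factoring = Arc(
    name="FACTOR",
    semantics=lambda r: is_univariate_numeric_polynomial(r.get("?EXPRESSION")),
    pragmatics=lambda r: r.get("?GOAL") == "find-roots",  # hypothetical goal label
)

regs = {
    "?EXPRESSION": {"variables": {"x"}, "coefficients": [1, 3, 2]},
    "?GOAL": "find-roots",
}
can_factor = factoring.applicable(regs)
```

A student who knows only the syntax of the transition corresponds, in
this sketch, to a solver that follows the arc without consulting either
predicate.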

A feature of programming environments, which has been helpful in
thinking about the PAZATN system for that domain, is that a great deal
of the student's reasoning is manifest in the protocol. Not all
CAI environments share this property. PAZATN would have more difficulty
with domains for which the "bandwidth" of the analyzer's window into the
student's thinking is low. This might be a problem in applying the
paradigm to WUMPUS [Stansfield and Carr 1976], WEST [Brown and Burton
1976], or SOPHIE [Brown et al. 1976]. For example, in the electronic
troubleshooting scenario, the student requests a particular
measurement, but provides no indication of the pragmatics, that is, the
reasoning which led to that measurement rather than another. Since
there are many routes by which the misguided troubleshooter could
have arrived at the requested measurement, a precarious chain of
statistical inferences from multiple trials is required to pinpoint
the student's underlying confusion.

Probably this would pose problems for any analyzer. Hence, the
extent to which the student's reasoning is articulated suggests itself as
a dimension along which to evaluate designs for future CAI environments.
Note that this is a property not only of the domain, but also of the
particular scenario used. For example, in the electronics domain, one
can envision a design scenario which would closely mimic the alleged
virtues of the programming world. (It would be essential to contrast
the reasoning strategies required for debugging an erroneous design to
those needed for troubleshooting a faulty component in a properly
designed circuit.) Another possibility is to ask the student to
explain his reasoning. The major stumbling block to such an undertaking
at the present time lies not in inadequate theories of problem solving,
but in the understanding of natural language.

Figure 14. Subgraph of Algebra ATN

Appendix 1



PRE TEST



Student 1:

83
+106
189

Explanation:



330
+187
417



89 
+132
111



354
+69
313



Student 2 : 

94
+115 
119 

Explanation: 



498 
+215 
611 



77 
+26 
91 



48 
89 



Student 3:

347
+139
476

Explanation:



758 
+296 
944 



437 
+284 
601 



923 
+481 
1404 



Student 4:

109
+452
501

Explanation:



98
+105
103



98 
+111 
209 



35 
+64 
99 



Student 5:

352
+18
360

Explanation:



784
+3080
6364



1784
+3080
7364



8- 
+35 



Student 6:

8372 
-657 
6725 

Explanation : 



6527
-2394
3233



893
-195
608



111



102 



63 

. -4i- 

16 



Student 7:
13

777
Explanation:



5394
-797
4497 



477 
-284 



101 



893 
-195 
718 



Student 8:



394
-166



Explanation:



Student 9:

48
-15
43

Explanation:



77 
-53 

^4- 



394
-166
340



935 
-361 
774 



57 
-23 



60 



126
-117
29



239 
-95 
124 



Student 10:

305
-108
107

Explanation:



987
-320
j67 



340 
-56 
290 



9280 
-6090
3090 
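The test above asks the reader to hypothesize a bug and check it against
the student's answers. As an illustration of that checking step only,
the sketch below encodes one hypothetical bug and tests it against
Student 2's three complete problems. The report does not state the
intended explanation for any student, so the hypothesis here (in any
column whose sum exceeds 9, the student writes the tens digit of that
sum and carries nothing) and the function name are assumptions; other
hypotheses may fit the same data.

```python
def add_writing_only_tens_digit(a: int, b: int) -> int:
    """Hypothetical buggy addition: when a column sum exceeds 9, write the
    tens digit of that sum instead of the units digit, and do not carry."""
    digits = []
    while a > 0 or b > 0:
        column_sum = a % 10 + b % 10
        digits.append(column_sum // 10 if column_sum > 9 else column_sum)
        a //= 10
        b //= 10
    return int("".join(str(d) for d in reversed(digits)) or "0")


# (first addend, second addend, student's answer) for Student 2
observed = [(94, 115, 119), (498, 215, 611), (77, 26, 91)]

consistent = all(
    add_writing_only_tens_digit(a, b) == answer for a, b, answer in observed
)
```

A hypothesis that reproduces every observed answer is only a candidate;
further problems chosen to separate it from rival hypotheses would be
needed before accepting it.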



Appendix 2 



List of all responses to the question:
What do you think you learned from this experience?

I see from this system that you learn from your mistakes. In a certain
operation there are so many mistakes that you can make. When you learn
what the mistakes are you learn to do the operation correctly.

That children's errors can be a way of diagnosing the way the child learns
material. Also it raises questions about the way a child is tested, both
standardized and informally.

A student's errors and/or misunderstanding of a concept may have not been
due to carelessness but rather involved a complex and logical thought
process.

I learned that it is necessary to try many different types of examples to
be sure that a child really understands. Different types of difficulties
arise with different problems.

Trying to beat the machine can be challenging. Feedback is extremely
important in trying to determine the error. It's difficult for me to
describe the error but the machine doesn't care as long as I can prove my
point through examples.

Although it's hard to tell from these pre and post tests, in the middle is
learned a great deal about the complexity of student's errors. I know that
young students can get these preconceived notions about how to do things
and it's very hard to find a pattern to their errors but there is and I
believe that BUGGY convinced me of [it].

That if you study the errors long enough you can eventually come up with a
reasonable solution as to why the error is occurring.

Through looking carefully at children's math errors it is sometimes
possible to discover a pattern to them. This pattern will tell you an area
or a concept the child does not understand.

I learned that there could be more to a child's mistakes other than
carelessness. Working with children with special needs I have encountered
many such problems, yet never stopped to analyze what could be a systematic
problem; for this I thank you.

Children do have problems and they are very difficult to spot especially
when a number of different operations are used to come to an answer. I've
learned to be more aware of how these children reach these "answers" and to
help them to correct them; first by knowing how they arrived at the answer.

Although many arithmetic errors may be careless, there may also be a
pattern that the kid is locked into. If you pick up on a pattern you can
test the child to see if he/she conforms to it and work on it from there.

The types of analysis necessary to "debug" student errors on the test
(paper/pencil) seems more difficult than with the computer. But that
doesn't make any sense. The "analysis" ought to be the same. Perhaps the
computer motivated my analytical ability.

I found that I have looked closer at the problems, looking for a
relationship between the set after working with BUGGY.

How to perceive problems, that don't look too consistent, a little easier.
How to have a good time with a computer. (I've only played tic-tac-toe at
the Science Museum, and have always wanted to do more). Machines can be
temperamental (when pestered by a large number of students?)

I learned and was exposed to the many different types of problems children
might have. I never realized the many different ways a child could devise
his own system to do a problem. I am now aware of problems that could
arise and I'm sure this will help me [in] my future career as a teacher.

How to more effectively detect "problems" students have with place value.

That you can find causes of a child's problem without the child's work in
front of you. In looking for the "bug", up and down aren't the only
possibilities, also diagonally. I suppose horizontally also. How specific
the problem might be: only works in one situation.

I have learned several new possible errors students may make in
computation. I have also learned somewhat how to diagnose these errors,
i.e. what to look for, and how specific errors can be.

I think I learned more about computers and how to use them. Also I learned
about diagnosing math difficulties. It makes me aware of problems that
children have and they sometimes think logically, not carelessly as
sometimes teachers think they do.

I learned that computers are very complicated pieces of machinery. If one
isn't experienced with the mechanisms, then problems could result. That
computers can be an asset to the classroom is not doubted, but I think many
problems can result. They can add much to a classroom until they start
breaking down.

That there are many problems that you can diagnose about a child by looking
at his homework.

If a child has repeatedly made [the] same mistakes, it is more easily
identified if the teacher has an opportunity to try and [make the] same
mistakes. This method can be solved at least quicker than ...

Computers are concise. Information can be gathered and stored

Tuned in to picking up malfunctions in simple addition and subtraction
which seemed to be realistic problems.

Appendix 3

List of all responses to the question:
What is your reaction to BUGGY?

I think it would be a fantastic resource for a school with a lot of money
to spend.

Too early to tell. But the potential seems stupendous. I enjoyed it and
see it as a powerful future tool.

I like it.

Working with a partner is good for being forced to explain (defend) your
theory [as long as partner requires that]. Useful tool for those with
pretty good number ability. What about those who don't have good feeling
for numbers?

Good!!! Forces one to get very specific answer to the problem. You can be
slightly wrong and then, rather moving way off base in your second theory
as to the problem, you pinpoint/modify your first (assuming it's almost
right). Bad. It's too much fun and I wasn't being very professional in my
usage (though under different situation I might).

I think this system is fantastic. It's a wonderful way to expose people
(who are involved with children) to the problems children will probably
have. It might be especially useful with special learning needs children.

It's great! When will it be in my "price" range?



As for the game itself, it would have been continued for another 3 or ...
hours.

I think it's an excellent device for trying to diagnose some of the
difficulties found in mathematics. For a teacher the time element: having
the machine diagnosis would be more practical.

It's a nice toy.

The bug is great. Makes you stop and think.

I enjoyed the BUGGY experience extensively. Solving or determining errors
was much easier on the computer, and fun too!

I enjoyed working with BUGGY but when it breaks down it is very
frustrating. This might be difficult for children to understand, that
problems with computers do arise. Also it may be complicated [for younger
children] to understand how to use it. High school students may enjoy it
though.

I think BUGGY would be a definite "plus" in the classroom but right now I
feel there are too many "bugs" with BUGGY. Too many times did BUGGY go
crazy. I find it amazing though that a machine can help one detect
problems. It sure is a better way than the present.

BUGGY makes one look at each problem carefully and detect exactly what a
child cannot do or cannot comprehend without formal testing.

As far as BUGGY is concerned, I had a very good time "playing" with BUGGY.
It was quicker and somehow easier than pencil and paper. It took less
concentration and was definitely more efficient. Can this be used as a
strictly diagnostic tool? If so, I think that BUGGY is great.

He's a trip! Seriously, he's fine if you can master him in case he decides
to break down.

I think BUGGY is a good idea and would like to hear about it.

It is a program that should be further researched and has excellent
potential.

Great experience in beginning to play with computers; exercised problem
focussing without frustrating a child with inadequate preparation.

BUGGY could be used to sharpen a teacher's awareness of
different difficulties with addition and subtraction. It might be fun for
the kids to play such a game together.



Appendix 4

This appendix presents answers and descriptions for some of the subtraction
bugs for the problem:

   15300
   -9522

    5778

95778: When borrowing from a column which has a 1 on top, the student
treats the 1 as if it were a 10.

27998: When borrowing is necessary, instead of subtracting 1 from the
top digit of the next column, the student adds 1 to it.

24822: The student adds instead of subtracts.

16888: When the student needs to borrow, he adds 10 to the top digit of
the current column without subtracting 1 from the top digit of the next
column.

15778: The student borrows correctly except he doesn't take 1 from the
top digits that are over blanks.

14822: The student adds without carrying instead of subtracts.

14378: The student subtracts the smaller digit in a column from the
larger digit regardless of which is on top.

and No matter what other bugs the student may have, he performs the
units column correctly even if it requires borrowing.

14222: The student subtracts the smaller digit in each column from the
larger regardless of which is on top. The exception is when 10 is in the
left-most columns of the top number; in this case 10 is treated like a
single digit.

14222: The student subtracts the smaller digit in a column from the
larger digit regardless of which is on top.

14200: The student subtracts the smaller digit in each column from the
larger digit regardless of which is on top. The exception is when the
top digit is 0, in which case a 0 is written as the answer for that
column, i.e. 0-N=0.

10022: The student doesn't know how to borrow. If the top digit in a
column is 0, the student writes the bottom digit in the answer (i.e.
0-N=N). If the top digit is smaller than the bottom digit, then 0 is
written in the answer.

10000: The student writes a 0 in any column in which borrowing is
needed.

8748: The student gets 6 and 9 mixed up when decoding (reading) the
digits in the problem, misreading 6 for 9, and 9 for 6.

7998: When borrowing from a column, the student borrows from the larger
digit disregarding whether it is the top or the bottom digit.

6888: The student will only borrow from a column in which the top digit
is larger. In the columns he skips (where the top digit is smaller)
he automatically adds 10 to the top digit.



6822: The student borrows from the next column to the left which has a
larger top digit. Any intervening columns have 10 added to their top
digit. The exception is when 0 is on top, in which case the student
writes the bottom number in the answer (e.g. 0-N=N).

5878: When borrowing from a column whose top digit is 0, the student
writes 9, but does not continue borrowing from the column to the left of
the 0.

5822: Whenever the top digit in a column is 0, the student writes the
bottom digit in the answer, i.e. 0-N=N.

5800: Whenever the top digit in a column is 0, the student writes 0 in
the answer, i.e. 0-N=0.

5798: When borrowing from a column with 0 on top, the student borrows
from the bottom digit instead of the 0 on top. In all other cases the
student borrows correctly.

5788: The student forgets to change 10 to 9 after borrowing into a
column whose top digit is 0.

5688: When the student needs to borrow from a column whose top digit is
0, he skips that column and borrows from the next one.

5678: Once the student needs to borrow from a column, he continues to
borrow into every column whether he needs to or not.

5372: When faced with borrowing, the student decrements the next column
correctly, but instead of adding ten to the top digit of the current
column, he simply subtracts the smaller digit from the larger digit even
though the smaller digit is on top.

4822: The student adds instead of subtracts, but when carrying he
subtracts the carry from the top digit of the next column instead of
adding it.

4222: The student subtracts the smaller digit in a column from the
larger digit regardless of which is on top.

and The student stops working the problem as soon as the bottom number
runs out.
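Several of the bugs listed above depend only on the two digits in the
current column, so their answers for 15300 - 9522 can be reproduced with
a very small simulation. The sketch below is an illustration written for
this appendix, not BUGGY's procedural network; the function names and
the short labels for each bug are hypothetical, and a blank in the
bottom number is treated as 0 (an assumption). Bugs that involve
borrowing state across columns are deliberately left out.

```python
def columns(top: int, bottom: int):
    """Yield (top_digit, bottom_digit) pairs from the units column leftward."""
    while top > 0 or bottom > 0:
        yield top % 10, bottom % 10  # a blank in the bottom number counts as 0
        top //= 10
        bottom //= 10


def apply_column_rule(top: int, bottom: int, rule) -> int:
    """Apply one per-column rule to every column and assemble the answer."""
    digits = [rule(t, b) for t, b in columns(top, bottom)]
    return int("".join(str(d) for d in reversed(digits)))


# Each rule maps (top digit, bottom digit) to the digit the student writes.
COLUMN_BUGS = {
    "adds 10 without decrementing next column":
        lambda t, b: t - b if t >= b else t + 10 - b,
    "adds without carrying":
        lambda t, b: (t + b) % 10,
    "smaller from larger":
        lambda t, b: abs(t - b),
    "smaller from larger, but 0-N=0":
        lambda t, b: 0 if t == 0 else abs(t - b),
    "0-N=N, otherwise 0 when a borrow is needed":
        lambda t, b: b if t == 0 else (0 if t < b else t - b),
    "writes 0 when a borrow is needed":
        lambda t, b: 0 if t < b else t - b,
}

# Answers as listed in this appendix for the same bugs.
LISTED = {
    "adds 10 without decrementing next column": 16888,
    "adds without carrying": 14822,
    "smaller from larger": 14222,
    "smaller from larger, but 0-N=0": 14200,
    "0-N=N, otherwise 0 when a borrow is needed": 10022,
    "writes 0 when a borrow is needed": 10000,
}

results = {name: apply_column_rule(15300, 9522, rule)
           for name, rule in COLUMN_BUGS.items()}
all_match = results == LISTED
```

Running the same rules on other problems gives a quick way to find test
items on which two candidate bugs produce different answers.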






